ABSTRACT

Curve of growth analysis, applied to the Lyman series absorption ratios 
deduced in our previous paper, yields a measurement of the logarithmic slope of 
the distribution of Lyman $\alpha$ clouds in column density N. The observed exponential
distribution of the clouds' equivalent widths W is then shown to require a broad
distribution of velocity parameters b, extending up to 80 km s$^{-1}$. We show how
the exponential itself emerges in a natural way. An absolute normalization for 
the differential distribution of cloud numbers in z, N, and b is obtained. By 
detailed analysis of absorption fluctuations along the line of sight (including 
correlations among neighboring spectral frequency bins) we are able to put 
upper limits on the cloud-cloud correlation function $\xi$ on several megaparsec
length scales. We show that observed b values, if thermal, are incompatible, in 
several different ways, with the hypothesis of equilibrium heating and ionization 
by a background UV flux. Either a significant component of b is due to bulk 
motion (which we argue against on several grounds), or else the clouds are out 
of equilibrium, and hotter than is implied by their ionization state, a situation 
which could be indicative of recent adiabatic collapse. 



Subject headings: cosmology: observations - quasars - intergalactic medium 
1. Introduction

In a previous paper (Press, Rybicki, and Schneider 1993, hereafter Paper I) we analyzed 
the statistics of the Lyman forest absorption, as it is seen against the UV continuum of the 
background quasars constituting the Schneider-Schmidt-Gunn high-redshift sample, whose 
redshifts z range from 2.5 to 4.3 (Schneider, Schmidt, and Gunn, 1991; hereafter SSG). 
We obtained an empirical fit for mean transmission as a function of redshift that agrees 
well with previous determinations at somewhat lower redshifts, and were additionally able 
to measure the mean ratios of absorption due to the lines Lyman $\alpha$, Lyman $\beta$, Lyman
$\gamma$, Lyman $\delta$, and possibly Lyman $\epsilon$. We also developed some formalism for interpreting
fluctuations in the transmission about its mean, and indicated that these fluctuations were 
close to, but perhaps somewhat larger than can be explained by a random distribution of 
clouds. 

Paper I restricted itself to conclusions which followed directly from the SSG data, 
essentially without the introduction of modeling assumptions. This paper broadens the 
analysis in several respects: 

First, we will take as input to the analysis not only the SSG data as studied in Paper 
I, but also some other published data, in particular the statistical distribution of equivalent 
widths in the redshift range 1.50 < z < 3.78 reported by Murdoch et al. (1986), and the 
statistical distribution of velocity parameters b shown as preliminary data by Carswell 
(1989; see his review for further attribution of this published and unpublished material). 

These additional data sets derive from high resolution studies that involve the 
identification, counting, and detailed fitting of model line profiles to the Lyman $\alpha$ lines; we
will call these results, generically, "line counting" measurements. Our results in Paper I, 
by contrast, did not depend on the identification of any single line, but only on aggregate 
statistical properties of the Lyman forest. In attempting to derive uniform conclusions from 
a merger of these two very different kinds of data, our challenge is to rely minimally on 
features of a data set that are most susceptible to systematic error. 

Second, we will eventually make some standard assumptions about the microscopic 
state of the gas in the clouds. In particular, we will assume that the gas consists of hydrogen 
and helium with a primordial cosmological abundance, and that its temperature and 
ionization state are determined by local interaction (though not necessarily equilibrium) 
with a posited background UV flux, conventionally, though not necessarily, taken to be due 
to the continuum emission of high-redshift quasars. 

The plan of this paper is as follows: In Section 2, we apply standard curve-of-growth 
analysis to the measured ratios of equivalent widths determined in Paper I. We are able
to determine (in the context of certain parametric models) the number distribution of 
cloud neutral column densities N without any use of line counting measurements. This 
determination is highly insensitive to the assumed distribution of velocity parameters 
b; thus, while we cannot say anything about b from the data of Paper I alone, our 
characterization of the N distribution is almost completely independent of any assumptions 
about b, which is not the case in line counting measurements. 

In Section 3, we show that the line counting data of Murdoch et al. (1986) can be 
fit, virtually perfectly, by the column density distribution that we determined in Section 
2, if it is independently combined with any one of several simple model distributions for 
b. We then compare the measured b distribution given by Carswell (1989) to the class of 
acceptable model distributions. At this point in the analysis we are able to determine an 
absolute normalization on the density of clouds, which we compare to previous published 
values. We are also able to explain the origin of the empirical exponential distribution
of equivalent widths introduced by Sargent et al. (1980). 

In Section 4 we revisit the question of fluctuations in the absorption around its 
mean, last addressed by us in Paper I. With the detailed number distribution in both N 
and b of the previous section, we use Monte Carlo techniques to generalize the analytic 
calculations of Paper I. We examine both the variance and the autocorrelation (in redshift) 
of the absorption for signs of an underlying two-point correlation function $\xi$ in the cloud
distribution. 

Sections 5 and 6 ask the question, "Where are all the baryons?" That is, we investigate
whether, with the distribution of N and b already determined, it is possible to hide all
of a plausible cosmological density of baryons in the population of clouds that is already 
known to exist at high redshifts (e.g., in the SSG sample's Lyman forest). We find that, 
with almost no model assumptions (that is, without assuming an internal structure, or 
even a physical size, for the clouds), it is possible to do so. Indeed, the problem is how 
to avoid cosmologically too many baryons in the clouds: if the clouds are in thermal and 
ionization equilibrium, and if the measured b values represent thermal velocities, then 
neutral hydrogen fractions are so small that the observed neutral absorption implies an 
overfilled universe. 

The most likely solutions are either that a significant component of the velocity 
parameter b is not thermal ("bulk motion hypothesis"), or that the clouds are in an 
out-of-equilibrium thermal state with respect to their ionization level, perhaps due to 
adiabatic heating from recent collapse ("non-UV heating hypothesis"). We discuss pros and 
cons of the two solutions. In either case, there is no cosmological necessity for any nonzero 
Gunn-Peterson (1965) effect.

Section 7 summarizes our conclusions. 



2. Curve of Growth Analysis 



Although curve of growth analysis is standard textbook fare, we are applying it in an
unusual range of parameters, so it is worth stating the basic equations used. For a line
with central wavelength $\lambda_0$ and central frequency $\nu_0$, the Doppler widths in wavelength and
frequency are defined by

$$\Delta\lambda_D = \frac{b}{c}\,\lambda_0 ; \qquad \Delta\nu_D = \frac{b}{c}\,\nu_0 \qquad (1)$$

where c is the speed of light and b is the velocity parameter, given, in the case of thermal
velocities only, by

$$b = \left(\frac{2kT}{m_H}\right)^{1/2} \qquad (2)$$



For a given Lyman line of interest, in terms of the atomic physics parameters $f$ (the
oscillator strength or $f$-value of the transition) and $\Gamma$ (the natural decay rate of the line,
$= \sum_{n'} A_{nn'}$) one defines a Voigt parameter,

$$a = \frac{\Gamma}{4\pi\,\Delta\nu_D} = \frac{\Gamma\lambda_0}{4\pi b} , \qquad (3)$$

and an "integrated line optical depth,"

$$\tau_0 = \frac{\pi e^2}{m_e c}\,\frac{N f}{\Delta\nu_D} = \frac{\pi e^2}{m_e c}\left(\frac{N f \lambda_0}{b}\right) \qquad (4)$$

where N is the column density of neutral hydrogen (assumed to be principally in its ground
level). In the usual case where $a \ll 1$, the integrated line optical depth $\tau_0$ is $\sqrt{\pi}$ times the
line center optical depth.

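Lyman alpha clouds along the line of sight to the quasar produce absorption lines by
pure extinction, so the residual intensity is given by

$$r(\lambda) = \frac{F_c - F(\lambda)}{F_c} = 1 - e^{-\tau_0 U(x,a)} , \qquad (5)$$

where $F(\lambda)$ is the measured flux, $F_c$ is the continuum flux, and $x = (\lambda - \lambda_0)/\Delta\lambda_D$. The
normalized Voigt function is defined by

$$U(x,a) = \frac{a}{\pi^{3/2}} \int_{-\infty}^{\infty} \frac{\exp(-t^2)}{(x-t)^2 + a^2}\,dt . \qquad (6)$$

This particular normalization makes $\int_{-\infty}^{\infty} U(x,a)\,dx = 1$.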

The equivalent width W of the line in wavelength is then given by

$$W = \int_0^{\infty} r(\lambda)\,d\lambda = \frac{b\lambda_0}{c} \int_{-\infty}^{\infty} \left(1 - e^{-\tau_0 U(x,a)}\right) dx , \qquad (7)$$

which may be computed by direct numerical quadrature.
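The quadrature in equation (7) can be sketched in a few lines. The version below is only an illustration under stated assumptions: it replaces the Voigt function by its Doppler core, $U(x) = e^{-x^2}/\sqrt{\pi}$ (the $a \to 0$ limit, so damping wings are ignored and the result should not be trusted at the very high column densities of the damped regime), it uses a plain trapezoidal rule, and the Lyman $\alpha$ wavelength and oscillator strength are standard literature values rather than numbers quoted in this paper. The function names are hypothetical.

```python
import math

# CGS constants
E_CHARGE = 4.80320e-10   # electron charge, esu
M_E = 9.10938e-28        # electron mass, g
C_LIGHT = 2.99792e10     # speed of light, cm/s

# Lyman alpha atomic data (assumed standard values, not taken from the text)
LAMBDA0 = 1215.67e-8     # rest wavelength, cm
F_OSC = 0.4164           # oscillator strength


def tau0(n_col, b_kms, f=F_OSC, lam0=LAMBDA0):
    """Integrated line optical depth of equation (4); n_col in cm^-2, b in km/s."""
    b = b_kms * 1.0e5
    return math.pi * E_CHARGE**2 / (M_E * C_LIGHT) * n_col * f * lam0 / b


def equivalent_width(n_col, b_kms, f=F_OSC, lam0=LAMBDA0, x_max=8.0, steps=4000):
    """Equivalent width in Angstrom from equation (7), Doppler core only."""
    t0 = tau0(n_col, b_kms, f, lam0)
    dlam_d = b_kms * 1.0e5 / C_LIGHT * lam0          # Doppler width, cm
    h = 2.0 * x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = -x_max + i * h
        u = math.exp(-x * x) / math.sqrt(math.pi)    # normalized Doppler profile
        absorbed = -math.expm1(-t0 * u)              # 1 - exp(-tau), accurate for small tau
        total += absorbed * (0.5 if i in (0, steps) else 1.0)
    return dlam_d * total * h * 1.0e8                # cm -> Angstrom
```

Calling `equivalent_width(n, b)` over a grid of N for b = 30, 50, 70 km/s traces the unsaturated and saturated parts of a curve of growth like that of Figure 1; a full Voigt profile would be needed to recover the damped part.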

Figure 1 shows the result of applying this prescription to the first five lines of the
Lyman series ($\alpha$, $\beta$, $\gamma$, $\delta$, $\epsilon$) over the range $10^{13}$ cm$^{-2}$ < N < $10^{22}$ cm$^{-2}$, and for three
values of b that will turn out to be of interest (30, 50, 70 km s$^{-1}$). At the very lowest
(unsaturated) and highest (damped) column densities, one sees that the equivalent widths
have, essentially, the ratios of their respective oscillator strengths, and they are independent
of b. In the intermediate (saturated) region, the dependence on Doppler width is significant
and approximately linear in b. The dependence on N is linear in the unsaturated regime,
slowly varying ($\sim \log^{1/2} N$) in the saturated regime, and varying as $N^{1/2}$ in the damped
regime. (See Spitzer 1978, §3.4, for further details.)

Let us look now at the implications of Figure 1 for the Lyman $\alpha$ clouds. Sargent (1988)
states straightforwardly an observational consensus that (i) the typical column density of
the clouds is $N \sim 10^{15}$ cm$^{-2}$, and (ii) there is a power law distribution of column densities
of the form

$$f(N)\,dN \propto N^{-\beta}\,dN \qquad (8)$$

with $\beta \approx 1.5$. Indeed, there is a lively debate (into which we are about to put an oar) on
the value, and possible non-constancy, of $\beta$ as N varies over 10 orders of magnitude, from
$10^{12}$ to $10^{22}$ cm$^{-2}$ (see, e.g., Tytler 1987, Bechtold 1988, Carswell et al. 1984, Carswell et al.
1987, Hunstead 1988, Rauch et al. 1992, Petitjean et al. 1993). Before proceeding, however,
we need to take note of the obvious fact that consensus points (i) and (ii) above, if they
are taken as fundamental features of the clouds rather than as artifacts of observational
selection, are mutually incompatible: a pure power law has no typical value!

Figure 1, on the other hand, shows that the value $N \sim 10^{14}$ cm$^{-2}$ is a special value
observationally: it is where the Lyman $\alpha$ line becomes saturated (for reasonable values
of b). Integrating the pure power law of equation (8) with the equivalent width shown in
Figure 1, one sees that the total absorption will diverge for $N \gg 10^{14}$ cm$^{-2}$ if $\beta < 1$ and will
diverge for $N \ll 10^{14}$ cm$^{-2}$ if $\beta > 2$. For $\beta$ in the allowed range $1 < \beta < 2$, the "typical"
cloud responsible for absorption will have $N \sim 10^{14}$ cm$^{-2}$. In this paper, and in Paper III of
this series, we will several times focus on the question of whether there is any good evidence
for a change in the properties of the clouds at $N \sim 10^{14}$ to $10^{15}$ cm$^{-2}$. In general, we will see
that the evidence is weak.
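The two divergence conditions can be made explicit with an idealized curve of growth. The lines below are a sketch, not part of the original derivation: $N_s \sim 10^{14}$ cm$^{-2}$ and $W_s$ are labels introduced here for the saturation column density and the (nearly constant) saturated equivalent width, the slow $\log^{1/2} N$ growth is neglected because it does not change either condition, and the damped regime at very large N is left out.

```latex
% Unsaturated side: W \propto N, so convergence at small N is controlled by N^{1-\beta}.
\int_0^{N_s} N^{-\beta}\, W(N)\, dN \;\propto\; \int_0^{N_s} N^{1-\beta}\, dN
  \quad \text{is finite only if } \beta < 2 ,
\\[4pt]
% Saturated side: W \approx W_s, so convergence at large N is controlled by N^{-\beta}.
\int_{N_s}^{\infty} N^{-\beta}\, W(N)\, dN \;\approx\; W_s \int_{N_s}^{\infty} N^{-\beta}\, dN
  \quad \text{is finite only if } \beta > 1 .
```

For $1 < \beta < 2$ the integrand per logarithmic interval, $N^{1-\beta} W(N)$, rises as $N^{2-\beta}$ below $N_s$ and falls as $N^{1-\beta}$ above it, so it peaks near $N_s$; this is the sense in which the absorption has a "typical" column density even though the power law itself has none.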

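We can thus turn with renewed interest to the question of determining the exponent
$\beta$. In Paper I, we obtained values, and a full covariance matrix, for the ratios

$$\frac{W_\beta/\lambda_\beta}{W_\alpha/\lambda_\alpha},\quad \frac{W_\gamma/\lambda_\gamma}{W_\alpha/\lambda_\alpha},\quad \frac{W_\delta/\lambda_\delta}{W_\alpha/\lambda_\alpha},\quad \frac{W_\epsilon/\lambda_\epsilon}{W_\alpha/\lambda_\alpha} . \qquad (9)$$

We now ask whether there is any single value of $\beta$ that, convolved with Figure 1, becomes
statistically consistent with the measured ratios. The figure of merit is the $\chi^2$ value,

$$\chi^2 = \sum_{i,j} (R_i - R_i^0)\, C^{-1}_{ij}\, (R_j - R_j^0) \qquad (10)$$

where $R_i = (W_i/\lambda_i)/(W_\alpha/\lambda_\alpha)$, $i = \beta, \gamma, \delta, \epsilon$, is a measured ratio from equation (9), $C_{ij}$ is the
measured covariance matrix (see Paper I), and $R_i^0$ is the theoretical value

$$R_i^0 = \frac{\int dN\, N^{-\beta}\, W_i(N,b)/\lambda_i}{\int dN\, N^{-\beta}\, W_\alpha(N,b)/\lambda_\alpha} . \qquad (11)$$

For a correct model, the value of $\chi^2$ should be distributed as the standard chi-square
function with 4 degrees of freedom.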
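A minimal sketch of how equations (10) and (11) might be evaluated is given below. It is an illustration only: the curves of growth are passed in as callables (for a fixed b), the integration limits `n_min` and `n_max` and the logarithmic trapezoidal grid are assumptions of this sketch, and the inverse covariance matrix is taken as an input rather than computed. None of the function names come from Paper I.

```python
import math


def power_law_ratio(beta, w_line, w_alpha, lam_line, lam_alpha,
                    n_min=1.0e12, n_max=1.0e18, steps=600):
    """Theoretical ratio R_i^0 of equation (11) for a pure power law N^-beta.

    w_line(N) and w_alpha(N) return equivalent widths at a fixed b.
    The integral over N is done in ln N, where dN N^-beta = d(ln N) N^(1-beta).
    """
    h = (math.log(n_max) - math.log(n_min)) / steps
    num = 0.0
    den = 0.0
    for k in range(steps + 1):
        n = n_min * math.exp(k * h)
        weight = n ** (1.0 - beta) * (0.5 if k in (0, steps) else 1.0)
        num += weight * w_line(n) / lam_line
        den += weight * w_alpha(n) / lam_alpha
    return num / den


def chi_square(measured, model, inv_cov):
    """Equation (10): quadratic form of the residuals with the inverse covariance."""
    d = [m - t for m, t in zip(measured, model)]
    n = len(d)
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))
```

Scanning `beta` over a grid, building the four model ratios with `power_law_ratio`, and passing them to `chi_square` yields a curve of the kind plotted in Figure 2.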

Figure 2 shows the result, for three different values of b. (In Section 3 we will introduce
distributions of b instead of single values.) One sees that the exponent $\beta$ is determined
to lie in the same narrow range, approximately independent of b. The values of $\chi^2$ at the
minima are quite consistent with 4 degrees of freedom, lending credibility both to the power
law model and to Paper I's resampling estimates of the covariance matrix (which had no
knowledge of the atomic physics embodied in Figure 1). Recalling that a 1-$\sigma$ error estimate
corresponds to a change $\Delta\chi^2 = 1$, we can summarize the measured value and uncertainty
(including both statistical uncertainty and uncertainty due to the value of b) as

$$\beta = 1.43 \pm 0.04 \qquad (12)$$

In the next section we will be more explicit about the range of N that contributes to this
determination, but it should be clear from inspecting Figure 1, and from the fact that
$\beta \sim 1.5$, that the range $N \sim 10^{14}$ to $10^{15}$ cm$^{-2}$ contributes most.
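The $\Delta\chi^2 = 1$ rule used to quote the uncertainty can be illustrated with a short helper. This is a sketch that assumes $\chi^2$ has already been tabulated on a grid of trial exponents and that the region below the threshold is a single connected interval; the function name is hypothetical.

```python
def one_sigma_interval(params, chi2):
    """Return (best, low, high): the grid minimum and the range with chi2 <= min + 1."""
    k_min = min(range(len(chi2)), key=chi2.__getitem__)
    threshold = chi2[k_min] + 1.0
    inside = [p for p, c in zip(params, chi2) if c <= threshold]
    return params[k_min], min(inside), max(inside)
```

For a parabolic $\chi^2$ curve the half-width of the returned interval equals the 1-$\sigma$ error, up to the spacing of the grid.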



It is possibly a debatable question as to whether the value in equation (12) is consistent
with previous determinations that rely on the counting of individual clouds. Hunstead
et al. (1987a; see also Hunstead 1988) obtain $\beta = 1.57 \pm 0.05$ fitting to a range
$13.25 < \log_{10} N < 16$, and find that a single power law is a good fit. This contrasts with
Carswell et al. (1987), who found evidence of a break at $\log_{10} N \approx 14.35$. Petitjean
et al. (1992) report $\beta = 1.49 \pm 0.02$ as the best single power law fit over the range
$13.7 < \log_{10} N < 21.8$. However, they obtain a somewhat steeper value $\beta = 1.83 \pm 0.06$ for a
fit restricted to $13.7 < \log_{10} N < 16$.

Considering the completely different nature of our method (not counting clouds,
and using Lyman lines higher than $\alpha$), the fact that our method is weighted differently in N,
and the somewhat different redshift ranges, we think that all the above quoted errors should
be taken with a grain of salt. In favor of our method is the fact that it is not susceptible to
observational biases in the efficiency with which one identifies, in noise, the very different
profiles of a weak line (at $N = 10^{13}$ cm$^{-2}$, say) and a saturated line (at $N = 10^{16}$ cm$^{-2}$, say).

It must quickly be said, however, that Paper I's measured ratios of equivalent widths,
taken by themselves, do not rule out broken power law models à la Carswell et al. (1987).
On the contrary, Figure 3 shows the results of fitting a broken power law model (equation 13)
for the column density of the break $N_0$. (The exponents in this model come from a specific
theoretical construction that will be detailed in Paper III. In this paper we will refer to this
model simply as "Model X".) One finds that, for any given value of b, Model X fits the
data embarrassingly well, with $\chi^2 \sim 1$ for 4 degrees of freedom. (This is bound to happen
about 1 time in 20 by chance. The different values of b are highly correlated and are not
independent embarrassments.) The value of the break $N_0$ is in the range $(0.8 - 2) \times 10^{15}$ cm$^{-2}$.

Figure 3 also shows, as the three "narrow" (approximate) parabolas, the result of
fitting to the (completely unrealistic) delta-function model in which all clouds have a single
column density $N_0$. One sees that, for each assumed b, there is a narrow window in $N_0$ that
is marginally allowed. We consider this an artifact, a kind of "$\chi^2$ leak", and expect that any
slightly better (smaller statistical errors) determination of the equivalent width ratios would
eliminate this, and hopefully not eliminate the power law or broken power law models.

To summarize: the equivalent width ratios measured in Paper I are fairly powerful at 
fixing one parameter within a model (e.g., either a single exponent, or the position of a 
single break), but not powerful at distinguishing among different models. In search of such 
distinctions we move, in the next section, to consider other data. 




$$f(N)\,dN \propto \begin{cases} N^{-\cdots}\,dN , & N < N_0 \\ N^{-\cdots}\,dN , & N > N_0 \end{cases} \qquad (13)$$
3. The Distribution of Equivalent Widths



A fundamental observational feature of the Lyman $\alpha$ clouds, first noted by Sargent
et al. (1980), is that their equivalent widths W are exponentially distributed (at least for
W > 0.3 Å or so) as

$$n(W)\,dW \propto \exp(-W/W_*)\,dW . \qquad (14)$$

Murdoch et al. (1986) give a compilation of data, including some from Carswell et al.
(1984) and Atwood, Baldwin, and Carswell (1985), all normalized to a fiducial redshift
z = 2.44, and obtain a fitted value $W_* = 0.278 \pm 0.018$ Å. Figure 4 shows as filled circles
the full set of data given in Murdoch et al. (Pay no attention to the curves in the figure,
for now.)

One should particularly take note of the distinct departure of the data from a pure
exponential at small equivalent widths $W \lesssim 0.2$ Å. Jenkins and Ostriker (1991), among
others, note that the apparent upturn in the number of clouds with small equivalent widths
can be largely explained by the transition from unsaturated to saturated lines (Figure 1).
We will want to see if "largely" can also mean "completely".

3.1. The Distribution in b

The quantities N (column density) and b (velocity parameter) are physically more
fundamental than W, which follows from them (as Figure 1 indicates). Since we have
in hand, from Section 2, a determination of the distribution of N that is independent of
all line counting techniques, we can now try to invert the data in Figure 4 and obtain a
distribution function for b. That distribution can then be compared to the distributions
that are obtained by line profile fitting (which, as we shall see, is a somewhat controversial
subject). This method of determining b is not really independent of line counting methods,
since we must use the equivalent width data of Murdoch et al. (or some other similar
data set); but it does provide a useful consistency check on some diverse data reduction
methods.

There is controversy about b on at least two issues, one of which we regard as
observational, the other theoretical. The observational issue is whether (as claimed by
Pettini et al., 1990) the distribution in b is peaked at "small" values, with a median b value
of 17 km/s, and essentially no values of b greater than 30 km/s, or whether (as claimed by
Carswell, 1989, Rauch et al., 1992, and others) the distribution in b is much broader, with
a mean value in the range 30-40 km/s and a tail extending significantly higher in velocity.
The theoretical issue, which is especially important if the b distribution is in fact broad, is
whether the measured values of b are indicative of thermal Doppler velocities, from which
conclusions can be drawn about the microphysical state of the clouds, or whether b includes
a significant component due to bulk motion (in which case a host of other theoretical
questions are raised).

As with many inversion problems, it is better to fit a parametric model than to try
to invert ab initio. To better understand the sensitivities of the inversion, we will try two
models. The first is a simple truncated Gaussian,

$$p(b)\,db \propto \begin{cases} \exp\left[-\dfrac{(b - b_0)^2}{2 b_*^2}\right] db , & b > 0 \\ 0 , & b < 0 \end{cases} \qquad (15)$$

Here $b_0$ is the mode of the distribution (because of the truncation, not the mean), while
$b_*$ parametrizes the standard deviation. The second model is a gamma distribution of the
form

$$p(b)\,db \propto b^{(b_0/b_*) - 1} \exp(-b/b_*)\,db , \qquad b > 0 \qquad (16)$$

where $b_0$ is the mean of the distribution, and $b_*$ again parametrizes the standard deviation.
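The two model shapes in equations (15) and (16) are easy to tabulate numerically. The sketch below is illustrative only: it normalizes each shape by trapezoidal integration on a grid truncated at an assumed `b_max`, and returns the mean and standard deviation so that the role of $b_0$ and $b_*$ in each model can be checked. The function names are hypothetical.

```python
import math


def truncated_gaussian(b, b0, bstar):
    """Unnormalized shape of equation (15); zero for negative b."""
    return math.exp(-(b - b0) ** 2 / (2.0 * bstar ** 2)) if b >= 0.0 else 0.0


def gamma_shape(b, b0, bstar):
    """Unnormalized shape of equation (16); zero for non-positive b."""
    return b ** (b0 / bstar - 1.0) * math.exp(-b / bstar) if b > 0.0 else 0.0


def mean_and_sigma(shape, b0, bstar, b_max=400.0, steps=40000):
    """Mean and standard deviation of a shape function, by trapezoidal sums (km/s)."""
    h = b_max / steps
    s0 = s1 = s2 = 0.0
    for k in range(steps + 1):
        b = k * h
        w = shape(b, b0, bstar) * (0.5 if k in (0, steps) else 1.0)
        s0 += w
        s1 += w * b
        s2 += w * b * b
    mean = s1 / s0
    return mean, math.sqrt(s2 / s0 - mean * mean)
```

With the gamma form, the mean comes out equal to $b_0$ and the standard deviation equal to $(b_0 b_*)^{1/2}$, which is the sense in which $b_*$ "parametrizes" the width; for the truncated Gaussian the mean lies somewhat above the mode $b_0$.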

We at this point make the assumption that b and N are independently distributed (this
important point will be discussed further in Section 5, below) and we assume that N is
distributed as a pure power law with $\beta$ given by equation (12). Writing the transformation
of probabilities in the form

$$p(W) = \int dN \int db\; p(N,b)\,\delta[W - W(N,b)]
      = \int dN \left.\frac{p(N,b)}{\partial W(N,b)/\partial b}\right|_{b=b(N,W)}
      = \int db \left.\frac{p(N,b)}{\partial W(N,b)/\partial N}\right|_{N=N(b,W)} \qquad (17)$$

we perform the integral (either one of the last two forms) using the theoretical curve
of growth function W(N,b), and we compute a $\chi^2$ goodness of fit measure to the data
in Murdoch et al., using the error bars given there. We then minimize $\chi^2$ for the two
parameters $b_0$ and $b_*$ in each of the two models (15) and (16). The results, along with some
values that characterize different moments of the fitted distributions (useful later), are given
in Table 1. The long- and short-dashed lines in Figure 4 show the result of computing
equation (17) for the best fitting models. The agreement is seen to be excellent for both
models. (The $\chi^2$ per degree of freedom is significantly less than unity, suggesting that at
least some of the error bars shown in Murdoch et al. 1986 are overestimated.)
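The first form of equation (17), with its delta function, has a direct Monte Carlo reading: draw (N, b) pairs from p(N,b) and histogram the resulting W. The sketch below illustrates that reading and is not the quadrature actually used for Figure 4. It assumes independent N and b, a power law for N confined to an assumed range `[n_min, n_max]`, the truncated Gaussian of equation (15) for b, and a user-supplied curve of growth `w_of_nb(N, b)`; all names are hypothetical.

```python
import math
import random


def sample_power_law(beta, n_min, n_max, rng):
    """Inverse-CDF draw from f(N) proportional to N^-beta on [n_min, n_max], beta != 1."""
    a = 1.0 - beta
    u = rng.random()
    return (n_min ** a + u * (n_max ** a - n_min ** a)) ** (1.0 / a)


def sample_truncated_gaussian(b0, bstar, rng):
    """Rejection draw from a Gaussian of mode b0 and width bstar, restricted to b > 0."""
    while True:
        b = rng.gauss(b0, bstar)
        if b > 0.0:
            return b


def histogram_w(w_of_nb, beta, b0, bstar, n_samples, bin_edges,
                n_min=1.0e13, n_max=1.0e17, seed=0):
    """Counts of sampled equivalent widths in the bins defined by bin_edges."""
    rng = random.Random(seed)
    counts = [0] * (len(bin_edges) - 1)
    for _ in range(n_samples):
        n = sample_power_law(beta, n_min, n_max, rng)
        b = sample_truncated_gaussian(b0, bstar, rng)
        w = w_of_nb(n, b)
        for k in range(len(counts)):
            if bin_edges[k] <= w < bin_edges[k + 1]:
                counts[k] += 1
                break
    return counts
```

Dividing the counts by the bin widths gives an estimate of p(W) up to normalization, which can then be compared with tabulated equivalent width counts in the same bins.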

Figure 5 shows (again as long and short dashes) the shapes of the best-fitting model 
distributions. Also shown in that figure, as a histogram, are the data for the distribution 
of b given in Carswell (1989). While Carswell cautions that the data (derived from several 
sources) is preliminary, and rightly warns of the difficulty of deriving such data from line 
profile fitting, we see that the basic shape and parameter values of the Carswell data are in 
good agreement with our fitted models that derive from the equivalent width distribution 
of Murdoch et al. (1986). We therefore include the Carswell data as an additional line in 
Table 1. The last line of that Table gives mean values over the previous lines (two fitted 
models plus one actual data set) and the spread of the models, which can be taken as an 
indication of the model uncertainty. (The standard errors of the fitted estimates for bo and 
b* are also comparable, around 2 km/s.) 

We have also integrated Carswell's data on b, via equation (17) and the $\beta$ value of
equation (12), to get its predicted distribution of W. This is shown as the solid line in
Figure 4. Note that there are no new adjustable parameters in this calculation, except for 
the overall normalization (moving the curve up or down). One sees that the result is in 
excellent agreement with the Murdoch data, virtually as good as the other two models. 

As is obvious from Figure 1, it is the large b tail of the distribution that produces the
large W exponential tail. While it is conceivable that the Carswell and Murdoch data
sets, both obtained by line counting techniques, share a common tendency to misidentify
blends, or lines from metal-line clouds, as Lyman $\alpha$ lines of large W, it seems quite unlikely
that the general shape of the exponential tail in W, independently measured by
many observers from Sargent et al. (1980) to the present and in several different redshift
ranges, could be wholly an artifact of such misidentifications. We conclude that the high-b
tails of the distributions in Figure 5, with b's in the range 40 to 80 km/s, are real and
observationally mandated by the exponential distribution of W's.

Distribution       best b_0   best b_*   <b>       sigma_b   <b^8.54>^(1/8.54)
                   (all values in km s^-1)
Gaussian           32         23         36        20        61
Gamma              38         14         39        23        77
Carswell data                            36        18        70
Mean and Spread                          37 ± 2    20 ± 3    69 ± 8

Table 1: Fitted parameters for the distribution of velocity parameters b (see text).

Thus, as regards the observational controversy mentioned above, we must be firmly on 
the side of a broad b distribution. In this, our conclusion agrees with Rauch et al. (1992), 
who discuss possible sources of discrepancy in the Pettini et al. data. 

Even if we exclude from consideration the long exponential tail in the W distribution, 
our data do not support entirely small b values: In Figure 4 we have plotted, as a dotted 
line, the best-fitting curve that derives from a single value (delta function distribution) 
of b. The best-fitting value is 33 km/s (cf. Table 1). One sees that the equivalent width 
distribution for W < 0.4 Å is reproduced tolerably well (though not as well as any of the
broader distributions), but that there is a severe shortage of large W values. 

In fact, the larger tail values for W are produced, we find, by clouds that have both
larger b values and larger N values. Figure 6 shows results from the same integrations that
produced the curves in Figure 4. At each equivalent width W, we show the mean N of
the clouds that contributed to that value of W, independent of their values of b. One sees
that values of N up to $\sim 10^{16}$ cm$^{-2}$ are directly probed by the data points in Figure 4. Thus,
Figure 4 yields a strong consistency check on the value, and constancy, of $\beta$ in the power law
distribution of N: Its value, derived from equivalent width ratios of the Lyman series, came
primarily from the saturation bend in Figure 1, at $N \sim 10^{14}$ cm$^{-2}$. Now, we see (Figures 4
and 6) that the correct number of clouds at $N \sim 10^{16}$ cm$^{-2}$ is obtained well within a factor
of 2.

To emphasize this point, we have plotted, as the final curve in Figure 4, the result
of integrating the broken power law Model X (equation [13]) with the observed Carswell
distribution of b. One sees that it falls significantly below the data both at small equivalent
widths and at large equivalent widths, demonstrating the apparent relative constancy of $\beta$,
at least over three orders of magnitude in N.

Now to answer another question raised at the beginning of this section, we can see
that the upward bend in the number of clouds at small equivalent width (Figure 4) can be
explained completely as the effect of folding the standard curve of growth (Figure 1) into a
constant $\beta$ (pure power law) model. (This, incidentally, suggests caution in approximating
that part of the distribution as another exponential, as in Jenkins and Ostriker 1991, since
it is actually a power law that diverges at zero.) To summarize, we find no feature in any of
the data sets examined that conflicts with a pure power law distribution for N; no data set
requires, or even unambiguously indicates, a "typical" N value of $10^{15}$ cm$^{-2}$, or any other
value, for the underlying cloud population.
Nothing yet in our discussion takes a position on what we have above labeled the
theoretical controversy surrounding b, i.e., indicates whether the observed b's derive purely 
from thermal Doppler broadening, as opposed to an additional component due to bulk 
motions. We will return to this point below. 



3.2. The Distribution of Clouds in N, b, z 



We can make explicit at this point a "standard" working model for the distribution
of the Lyman $\alpha$ clouds in N, b, and redshift z at high redshift: Let $N_{14}$ denote the value
$N/10^{14}$ cm$^{-2}$, and let $n(N,b,z)\,dN_{14}\,db\,dz$ denote the mean number of clouds in an
interval $dN_{14}$, db, dz. Then,

$$n(N,b,z) = \mathcal{N}_0\,(1+z)^{\gamma}\,N_{14}^{-\beta}\,p(b) \qquad (18)$$

where p(b), now assumed to be normalized to unity,

$$\int_0^{\infty} p(b)\,db = 1 , \qquad (19)$$

has the shape of any of the three distributions in Figure 5 (for example, equations [15]
or [16] with parameters from Table 1). Our best estimate of $\beta$ (equation [12]) is 1.43. Our
best estimate of $\gamma$ (Paper I) was 2.46. We can estimate $\mathcal{N}_0$ by equating the observed
mean transmission (Paper I) to the integral of equation (18) weighted by the appropriate
equivalent widths, namely

$$0.0037\,(1+z)^{1+\gamma} = \int\!\!\int \frac{(1+z)\,W_\alpha(N,b)}{\lambda_\alpha}\;\mathcal{N}_0\,(1+z)^{\gamma}\,N_{14}^{-\beta}\,p(b)\;dN_{14}\,db \qquad (20)$$


Doing the integral numerically for the three b distributions in Figure 5, we obtain the
absolute normalizations

$$\mathcal{N}_0 = (2.57,\ 2.54,\ 2.63) \qquad (21)$$

where the three values are using the observed (Carswell), gamma, and Gaussian
distributions, respectively. The values are not, of course, as accurate as the number of
significant figures shown, but are given to illustrate how little uncertainty is due to our
remaining ignorance of the b distribution. The units of equation (21) are "clouds per unit
redshift per unit $\Delta N = 10^{14}$ cm$^{-2}$ at $N = 10^{14}$ cm$^{-2}$". Most of the error in determining $\mathcal{N}_0$
comes from the highly correlated errors in $\gamma$ and the value 0.0037, as described in Paper I.
For redshifts in the range 3 < z < 4, we estimate the overall error of equation (21) to be on
the order of ±10%.
To compare with previous results by other investigators, we have also computed
numerically the number of clouds with equivalent widths W > 0.32 Å (that value being
conventional in the literature) per unit redshift, a quantity conventionally denoted $N_0$. We
obtain

$$N_0 = (4.2 \pm 0.5)\,(1+z)^{\gamma} \qquad (22)$$

(for redshifts 3 < z < 4) which can be compared to the estimate of Murdoch et al. (1986)
of 4.06 (no error given), and Jenkins and Ostriker's (1991) estimate of 9 ± 3, obtained from
Murdoch's data for n(W) (the data points in our Figure 4) and their own measurements of
the Lyman $\alpha$ broadband decrement $D_A$.



3.3. Upper Limits to Size Derived from Spacing 



Knowing an absolute normalization on the distribution in N, b, z, we immediately get 
some upper limits to the size of the clouds: Since the number of clouds decreases rapidly 
with increasing N, the average size of clouds with column density $\gtrsim N$ must be less than
the average spacing along the line of sight of such clouds. There is, of course, no reason to 
think that this bound lies close to the true cloud sizes. 

Distance along the line of sight is variously parametrized by the redshift separation dz,
the separation in observed wavelength $d\lambda_{\rm obs}$, the separation in comoving distance dr, or the
separation in physical (proper) distance $dr_p$. The relation among these quantities is

$$(1+z)\,dr_p = dr = \frac{3000\,h^{-1}\ \mathrm{Mpc}}{(1+z)(1+\Omega z)^{1/2}}\,\frac{d\lambda_{\rm obs}}{\lambda_0} = \frac{3000\,h^{-1}\ \mathrm{Mpc}}{(1+z)(1+\Omega z)^{1/2}}\,dz \qquad (23)$$

where h parametrizes the Hubble constant $H_0$ by $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$. Integrating
equation (18) in both b and N, the number of clouds with column densities greater than N
in a redshift interval dz is

$$dn_{>N} = \frac{\mathcal{N}_0}{(\beta - 1)\,N_{14}^{\beta - 1}}\,(1+z)^{\gamma}\,dz \qquad (24)$$

Equations (23) and (24) imply a limit on the physical size $r_{\rm phys}$ of clouds

$$r_{\rm phys} < \frac{dr_p}{dn_{>N}} = \frac{3000\,h^{-1}\ \mathrm{Mpc}}{\mathcal{N}_0}\,\frac{(\beta - 1)\,N_{14}^{\beta - 1}}{(1+z)^{\gamma + 2}\,(1+\Omega z)^{1/2}} \qquad (25)$$

Evidently the tightest limit is obtained by applying the argument at the highest redshift
and to the smallest-N clouds. In Paper I, we saw that the value $\gamma \sim 2.46$ was justified up
to a redshift of at least z ~ 4.2, while Figure 6, above, shows that the observed equivalent
width distribution of clouds justifies the value $\beta \sim 1.43$ down to at least $N = 10^{13}$ cm$^{-2}$, i.e.,
$N_{14} = 0.1$. Taking these values, and $\mathcal{N}_0$ from equation (21) above, we get, for $N \sim 10^{13}$ cm$^{-2}$
clouds at z ~ 4.2,

$$r_{\rm phys} < \begin{cases} 120\,h^{-1}\ \mathrm{kpc} & \text{if } \Omega = 0 \\ 50\,h^{-1}\ \mathrm{kpc} & \text{if } \Omega = 1 \end{cases} \qquad (26)$$

Sargent (1988) summarizes the more direct evidence for the size of the clouds that
comes from studies of close (or lensed) lines of sight at somewhat lower redshifts. It is
believed that "typical" clouds (which we would infer to be at $N \sim 10^{14-15}$ cm$^{-2}$) have sizes
in the range ~8 to ~20 kpc; however, these values are fairly uncertain. We return to the
issue of cloud size in §5.
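The arithmetic behind the size limits of this subsection can be checked with a short script. The sketch below assumes the forms of equations (23) to (25) as written above, namely a cumulative cloud count of $\mathcal{N}_0 (1+z)^{\gamma} / [(\beta-1) N_{14}^{\beta-1}]$ per unit redshift and a proper path length of $3000\,h^{-1}$ Mpc $/[(1+z)^2 (1+\Omega z)^{1/2}]$ per unit redshift; the default parameter values are the best estimates quoted in the text, and the function names are hypothetical.

```python
import math


def clouds_per_unit_z_above(n14, z, norm=2.57, beta=1.43, gamma=2.46):
    """Number of clouds per unit redshift with column density above n14 * 1e14 cm^-2."""
    return norm * (1.0 + z) ** gamma / ((beta - 1.0) * n14 ** (beta - 1.0))


def proper_kpc_per_unit_z(z, omega, h=1.0):
    """Proper path length per unit redshift, in kpc, for a matter-dominated model."""
    hubble_length_kpc = 3.0e6 / h      # 3000 h^-1 Mpc expressed in kpc
    return hubble_length_kpc / ((1.0 + z) ** 2 * math.sqrt(1.0 + omega * z))


def size_upper_limit_kpc(n14, z, omega, h=1.0, norm=2.57, beta=1.43, gamma=2.46):
    """Mean proper spacing of clouds above n14, which bounds their size from above."""
    return proper_kpc_per_unit_z(z, omega, h) / clouds_per_unit_z_above(
        n14, z, norm, beta, gamma)
```

Evaluating `size_upper_limit_kpc(0.1, 4.2, 0.0)` and `size_upper_limit_kpc(0.1, 4.2, 1.0)` gives values close to the 120 and 50 $h^{-1}$ kpc limits quoted in the text.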



3.4. Where Does the Exponential Come From? 



We have already seen in the three best-fitting curves of Figure 4 that the "standard"
model of equation (18), with the parameter values given, is able to reproduce the
long-observed exponential tail in the distribution of W, equation (14). One might well wonder
how this manages to be true, since (if we momentarily exclude the gamma distribution
model for b) there are apparently no linear exponentials anywhere in the model: the
distribution in N is power law, the distribution in b is (or can successfully be taken to be)
Gaussian, and the curve of growth varies in the saturated regime as $\ln^{1/2} N$. The most true,
but least illuminating, "explanation" is simply that the numerics of integrating together
these three functional forms happens, readily, to produce something very close to (though
not exactly) a linear exponential in W.

A more illuminating, though necessarily crude, explanation can be found in a toy
analytical calculation which actually gives about the right value for the exponential scale
$W_*$ in equation (14): From equation (^), we note that for small values of $\tau_0$ the equivalent
width is approximately $W = \Delta\lambda_D\,\tau_0$; this is the "linear" part of the curve of growth, where
$W$ increases linearly with $N$. This linear growth saturates when the line optical depth
becomes of order unity, or when $\tau_0 = 1$. This implies a saturation value

$$W_{\rm sat} = \Delta\lambda_D \tag{27}$$

which occurs at the saturation column depth of

$$N_{\rm sat} = \frac{m_e c}{\pi e^2 \lambda_0 f}\,b \tag{28}$$
For $N$ larger than $N_{\rm sat}$, we shall approximate the "flat" part of the curve of growth literally
by taking the equivalent width to be the constant $W = W_{\rm sat}$. We shall ignore the so-called
"square root" part of the curve of growth, which occurs for much larger values of $N$, as it
appears from our numerical work that this part does not influence the results for the $W$'s
we are concerned with. Our approximate relation for the equivalent width is then

$$W(N, b) = \Delta\lambda_D \min(N/N_{\rm sat},\, 1) = A \min(N,\, Bb) \tag{29}$$

where the quantities $A$ and $B$ are defined by

$$A = \frac{\pi e^2}{m_e c^2}\,\lambda_0^2 f; \qquad B = \frac{m_e c}{\pi e^2 \lambda_0 f} \tag{30}$$



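Now starting with the first form of equation (17), we have

$$\begin{aligned}
p(W) &\propto \int dN \int db\; \delta[W - A\min(N, Bb)]\, N^{-\beta}\, p(b) \\
&= \int_0^\infty db\; \delta(W - ABb)\, p(b) \int_{Bb}^\infty dN\, N^{-\beta} + \int_0^\infty dN\; \delta(W - AN)\, N^{-\beta} \int_{N/B}^\infty db\; p(b) \\
&= \frac{1}{AB}\, p\!\left(\frac{W}{AB}\right) \int_{W/A}^\infty dN\, N^{-\beta} + \frac{1}{A}\left(\frac{W}{A}\right)^{-\beta} \int_{W/(AB)}^\infty db\; p(b)
\end{aligned} \tag{31}$$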



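We next choose the Gamma distribution (16) to represent $p(b)$, along with the numerical
values for the constants $b_0$ and $b_*$ given in Table 1. Then

$$p(W) \propto \frac{1}{(\beta-1)\,\Gamma(a)\,W_*}\left(\frac{W}{A}\right)^{1-\beta}\left(\frac{W}{W_*}\right)^{a-1} e^{-W/W_*} + \frac{1}{A}\left(\frac{W}{A}\right)^{-\beta}\frac{\Gamma(a, W/W_*)}{\Gamma(a)} \tag{32}$$

where $a = b_0/b_* = 2.7$ and $\Gamma$ denotes the incomplete gamma function. We have also defined
the quantity

$$W_* = ABb_* = \frac{\lambda_0}{c}\,b_* \tag{33}$$

For Lyman alpha (assuming a mean redshift factor of $(1+z) \approx 4.5$) this implies $W_* = 0.18\,b_6$
Å, where $b_6$ is the value of $b_*$ in units of $10^6$ cm s$^{-1}$ = 10 km s$^{-1}$.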
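
The numerical value quoted for $W_*$ can be checked directly from equation (33). The short sketch below is only an illustration: the function name `w_star` is hypothetical, and it assumes that the rest-frame scale $(\lambda_0/c)\,b_*$ is simply multiplied by the mean factor $(1+z) = 4.5$ to reach the observed frame.

```python
# Toy-model exponential scale W_* = (lambda_0 / c) * b_*  (equation 33),
# moved to the observed frame by an assumed mean factor (1 + z) = 4.5.
C_KM_S = 299_792.458   # speed of light [km/s]
LAMBDA_LYA = 1216.0    # Lyman-alpha rest wavelength [Angstrom]


def w_star(b_star_km_s, one_plus_z=4.5):
    """Observed-frame exponential scale W_* in Angstrom (hypothetical helper)."""
    return LAMBDA_LYA * one_plus_z * b_star_km_s / C_KM_S


w_per_b6 = w_star(10.0)   # b_* = 10 km/s, i.e. b_6 = 1
w_pref = w_star(14.0)     # b_* = 14 km/s, the preferred Table 1 value
print(f"W_* per unit b_6 = {w_per_b6:.3f} A; W_*(14 km/s) = {w_pref:.3f} A")
```

Both numbers agree, to the precision quoted in the text, with $W_* = 0.18\,b_6$ Å and with the value of about 0.25 Å for $b_* = 14$ km s$^{-1}$.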

In the asymptotic regime $W/W_* \gg 1$, the incomplete gamma function can be replaced
by its asymptotic value, which yields (omitting some overall factors),

$$p(W) \propto (W/W_*)^{a-\beta}\, e^{-W/W_*} + \frac{\beta-1}{W/W_*}\,(W/W_*)^{a-\beta}\, e^{-W/W_*} \tag{34}$$

the first term clearly dominating. In the opposite limit $W/W_* \ll 1$, the second term
dominates in equation (32), since $\Gamma(a, W/W_*) \approx \Gamma(a)$ and the factor $(W/A)^{-\beta}$ becomes
very large. Thus

$$p(W) \propto (W/W_*)^{-\beta} \tag{35}$$

It can be seen that this toy model shows more or less the right behavior. The large
$W$ limit (equation 34), dominated by the flat part of the curve of growth, is roughly
exponential [with some deviation induced by the factor $(W/W_*)^{a-\beta} \approx (W/W_*)^{1.3}$]. In the
small $W$ limit, dominated by the linear part of the curve of growth, the power law in $N$
makes its presence felt, and the distribution has the singular behavior $(W/W_*)^{-\beta}$ as $W$
goes to zero. These two behaviors are clearly seen in figure 5. Furthermore, the predicted
value of $W_*$ for the preferred value of $b_* = 14$ km s$^{-1}$ is $W_* = 0.25$ Å, which, given the crude
approximations made, is not too far from the value given after equation (14). It should be
emphasized that the approximations made in this calculation are not well justified, and
that the answer is justified by the accurate numerical integration, not by this calculation.
However, this toy example does show how it is possible for an exponential like equation
(14), long observed and not previously explained, to emerge from a featureless distribution
like equation (18). It also explains the upturn at small equivalent widths as a consequence
of the power law distribution of column densities.
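
The two limiting behaviors of the toy model can also be seen numerically. The following is a minimal sketch, not the accurate numerical integration referred to above. It assumes that $p(b)$ is the standard gamma density with shape $a = b_0/b_* = 2.7$ and scale $b_*$, so that, up to an overall constant, the toy distribution in $x = W/W_*$ is $x^{a-\beta} e^{-x}/(\beta-1) + x^{-\beta}\,\Gamma(a, x)$. The helper names and the Simpson-rule quadrature are illustrative choices.

```python
import math

BETA = 1.43     # column-density power-law index
A_SHAPE = 2.7   # a = b_0 / b_*, assumed shape parameter of the gamma density


def upper_gamma(a, x, span=60.0, n=6000):
    """Upper incomplete gamma function via Simpson's rule on [x, x + span]."""
    h = span / n

    def f(t):
        return t ** (a - 1.0) * math.exp(-t)

    total = f(x) + f(x + span)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(x + i * h)
    return total * h / 3.0


def p_toy(x):
    """Unnormalized toy p(W) as a function of x = W / W_*."""
    flat = x ** (A_SHAPE - BETA) * math.exp(-x) / (BETA - 1.0)   # saturated clouds
    linear = x ** (-BETA) * upper_gamma(A_SHAPE, x)              # unsaturated clouds
    return flat + linear


def log_slope(x, eps=1e-3):
    """d ln p / d ln x by a centered finite difference."""
    lo, hi = x * (1.0 - eps), x * (1.0 + eps)
    return (math.log(p_toy(hi)) - math.log(p_toy(lo))) / (math.log(hi) - math.log(lo))


small_w_slope = log_slope(0.01)      # power-law regime: close to -beta
tail_rate = log_slope(8.0) / 8.0     # d ln p / dx in the tail: close to -1
print(f"small-W logarithmic slope = {small_w_slope:.3f}")
print(f"tail decay rate d ln p/dx at x = 8: {tail_rate:.3f}")
```

At small $x$ the logarithmic slope approaches $-\beta$, and at large $x$ the decay rate approaches $-1$ per unit $W/W_*$, modified by the slowly varying prefactor $(W/W_*)^{a-\beta}$.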



4. Fluctuations in the Transmission 



In Paper I we measured the variance of the transmission $Q$ for the Lyman forest of the
SSG quasar sample, and obtained

$$\mathrm{Var}(Q)/\bar{Q}^2 = 0.06 \pm 0.01 \tag{36}$$

This value has no fundamental significance by itself: it depends in part on the particulars
of SSG's spectroscopy, for example their slit resolution. In Paper I we developed an analytic
model based on a power law distribution of $N$, like equation (18) in this paper, but with a
fixed line width (not equivalent width) $W_0$. The result was

$$\frac{\mathrm{Var}(Q)}{\bar{Q}^2} = \frac{2W_0}{\Delta}\left[\frac{1}{\kappa}\left(e^{\kappa} - 1\right) - 1\right] \tag{37}$$

where

$$\kappa = (2 - 2^{\beta-1}) \ln\frac{1}{\bar{Q}} = (2 - 2^{\beta-1})\,\tau \tag{38}$$
and $\beta$ has its same meaning as in equation (18). The instrumental resolution $\Delta$ is defined
in general by

$$\Delta = \left(\int w\,d\lambda\right)^2 \Big/ \int w^2\,d\lambda \tag{39}$$

where $w(\lambda)$ is the instrumental response to a monochromatic line as a function of wavelength
(in the same frame as $W_0$ is defined, either observed or at the absorption redshift). We
noted in Paper I that, for the SSG data, equations (36) and (37) were in reasonably good
agreement, though with perhaps some possibility that the required $W_0$ was unreasonably
large, i.e., that the variance might be larger than could be explained by Poisson statistics
alone.
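
As a concrete illustration of the definition of the instrumental resolution, $\Delta = (\int w\,d\lambda)^2 / \int w^2\,d\lambda$, the sketch below evaluates it on a wavelength grid for two response functions. The grid spacing and the Gaussian width are hypothetical choices made only for this example; the 25 Å square slit is the SSG value quoted in the text.

```python
import math


def resolution(w, d_lambda):
    """Delta = (sum w dlam)^2 / (sum w^2 dlam): Riemann-sum form of the definition."""
    s1 = sum(w) * d_lambda
    s2 = sum(v * v for v in w) * d_lambda
    return s1 * s1 / s2


D_LAM = 0.05                                  # grid spacing [Angstrom], illustrative
slit = [1.0] * int(round(25.0 / D_LAM))       # square slit, 25 A wide
delta_square = resolution(slit, D_LAM)        # equals the slit width

SIGMA = 10.0                                  # hypothetical Gaussian response width [Angstrom]
n_pts = int(round(160.0 / D_LAM)) + 1
grid = [-80.0 + i * D_LAM for i in range(n_pts)]
gauss = [math.exp(-0.5 * (x / SIGMA) ** 2) for x in grid]
delta_gauss = resolution(gauss, D_LAM)        # analytically 2 * sqrt(pi) * SIGMA

print(f"square slit: Delta = {delta_square:.2f} A")
print(f"Gaussian:    Delta = {delta_gauss:.2f} A")
```

For a square slit $\Delta$ is just the full slit width, independent of the response amplitude; for a Gaussian response it is $2\sqrt{\pi}$ times the dispersion.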

We are now in a position to be more quantitative about whether there is a discrepancy,
and also to look not only at the variance but also at quantities like the covariance
$\langle Q(\lambda)\,Q(\lambda + \delta\lambda)\rangle$ which contain information on the two-point spatial correlation function of
the Lyman $\alpha$ clouds. Suppose that $\xi(s)$ is the two-point correlation function of the clouds at
an observed wavelength separation $s$. (Below, we will convert from wavelength separations
to comoving cosmological distances.) That is, given one cloud, the probability of finding a
second cloud separated from the first by an observed absorption wavelength difference $s$ is
the mean probability times $1 + \xi(s)$.

Because of finite spectral resolution of the SSG data, and the fact that individual
clouds are not counted, the quantity $\xi(s)$ is not directly accessible to measurement. Instead,
we can measure the covariance of the binned values of transmission,

$$\Xi_J \equiv \frac{\langle Q_i Q_{i+J}\rangle - \langle Q_j\rangle^2}{\tau^2\,\langle Q_j\rangle^2} \tag{40}$$

where the angle brackets average over the index of the 10 Å-separated bins of the SSG
spectroscopic data. The factor of $\tau^{-2}$ (where $\tau$ is the mean optical depth) scales $\Xi$ to
measure fluctuations in optical depth (that is, fluctuations in number of clouds) rather
than fluctuations in transmission. The two scalings are related by

$$Q \sim \exp(-\tau) \quad \text{implying} \quad \frac{\delta\tau}{\tau} \sim -\frac{1}{\tau}\,\frac{\delta Q}{Q} \tag{41}$$

If $w(s)$ is the instrumental response (as in equation (39) above), then $\Xi_J$ (observable) is
related to $\xi(s)$ (underlying) by

$$\Xi_J = \frac{\int w{*}w(s)\,\xi(s + J\Delta\lambda)\,ds}{\left[\int w(s)\,ds\right]^2} \tag{42}$$
where $\Delta\lambda$ is the bin separation (here 10 Å), and $w{*}w(s)$ is (a single function of $s$) the
convolution of $w$ with itself,

$$w{*}w(s) = \int w(s + y)\,w(y)\,dy \tag{43}$$

There are two interesting limits for equation (42): If $\xi(s)$ is slowly varying on a scale
$\gg \Delta$ (the instrumental resolution, equation (39)), then $\Xi$ measures $\xi$ directly at wavelength
separations $J\Delta\lambda$,

$$\Xi_J \approx \xi(J\Delta\lambda)\left\{\frac{\int w{*}w(s)\,ds}{\left[\int w(s)\,ds\right]^2}\right\} = \xi(J\Delta\lambda) \tag{44}$$

If, on the other hand, $\xi(s)$ is non-negligible only at values of $s$ that are $\ll \Delta$ (unresolved
structure), then

$$\Xi_J \approx \frac{\int \xi(s)\,ds}{\Delta}\,\frac{w{*}w(J\Delta\lambda)}{\int w^2(s)\,ds} \tag{45}$$

so that, in particular, the unlagged ordinary variance is given by

$$\Xi_0 \approx \frac{\int \xi(s)\,ds}{\Delta} \tag{46}$$

while higher values $\Xi_J$ decay as the normalized convolution of the instrumental response
with itself. (For a square slit, this convolution is a triangular response starting at unity and
going to zero when $s$ equals the full width of the slit.)
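
The triangular decay for a square slit is easy to tabulate at the bin lags. The sketch below is illustrative only: it takes the 25 Å slit width and 10 Å bin separation quoted for the SSG data, and the function name is hypothetical.

```python
def slit_autocorrelation(width, lag):
    """Normalized self-convolution w*w(lag) / w*w(0) of a unit-height square slit."""
    overlap = max(0.0, width - abs(lag))   # overlap length of two shifted top-hats
    return overlap / width


SLIT_WIDTH = 25.0   # SSG slit width [Angstrom]
BIN_SEP = 10.0      # bin separation [Angstrom]

profile = [slit_autocorrelation(SLIT_WIDTH, j * BIN_SEP) for j in range(4)]
for lag, value in enumerate(profile):
    print(f"lag {lag}: normalized response = {value:.2f}")
```

With these numbers an unresolved correlation contributes at lags 1 and 2 with relative weights 0.6 and 0.2, and nothing at lag 3 or beyond.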

It is sometimes useful to think of $\xi(s)$ as having two additive parts, a Poisson piece
$\xi_P$ that is some multiple of a Dirac delta function $\delta(s)$ at zero lag, and which describes
fluctuations due to the discreteness of the clouds, and a smooth piece $\xi_S$ that is generated
by spatial variations in the underlying mean density of clouds. (See Peebles, 1980, §33-§36,
especially equation 36.7.) Then equation (37), divided by $\tau^2$, can be viewed as calculating
the strength of the delta function piece $\xi_P$ in equation (46).

With this formalism in hand, we now need to upgrade equation (37) to a more realistic
model of cloud line statistics, in particular to the model equation (18) above. We have done
this by Monte Carlo experiment. Repeatedly realizing equation (18) with Poisson statistics
along a simulated line of sight, we have accumulated statistics on how the resulting variance
varies with mean optical depth (or with $\kappa$ in equation (37)). We find that the functional form
of equation (37) is well reproduced, but with an "effective" value for $W_0$ (in the observed
frame, e.g., with $\Delta = 25$ Å for the SSG data)

$$W_0 \approx (5.5 \pm 1)\,\frac{\langle b\rangle\,\lambda_0}{c}\,(1+z) \tag{47}$$
The scalings with $\langle b\rangle$, $\lambda_0$, and $c$ are required by dimensional analysis. The factor $(1+z)$
puts $W_0$ into the observed frame, for comparison with $\Delta$. The Monte Carlo calculations
verify (at about the 20% level) the shape of the dependence on $\tau$, and compute the overall
constant.

Using the numerical values $\langle b\rangle = 37$ km s$^{-1}$ (Table 1), $\lambda_0 = 1216$ Å, $\Delta = 25$ Å (SSG),
$(1+z) = 4.5$ and $\tau = 0.67$ (Paper I, and SSG's mean redshift), equations (37), (38), and
(47) give

$$\mathrm{Var}(Q)/\bar{Q}^2 = 0.076 \tag{48}$$

which is in fairly good agreement with Paper I's measured value (or equation (36) above),
and which will prove to be in excellent agreement with a slightly different way of reducing
the data, below.
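
The arithmetic behind the predicted variance can be reproduced in a few lines. This is a sketch under a stated assumption: the analytic variance formula is taken to be $\mathrm{Var}(Q)/\bar{Q}^2 = (2W_0/\Delta)\,[(e^{\kappa}-1)/\kappa - 1]$ with $\kappa = (2 - 2^{\beta-1})\,\tau$, and the effective width to be $W_0 = 5.5\,\langle b\rangle \lambda_0 (1+z)/c$. The function names are hypothetical.

```python
import math

C_KM_S = 299_792.458   # speed of light [km/s]


def effective_width(b_mean_km_s, lambda0, one_plus_z, factor=5.5):
    """Effective observed-frame line width W_0 [Angstrom], Monte Carlo calibrated factor."""
    return factor * b_mean_km_s * lambda0 * one_plus_z / C_KM_S


def kappa(beta, tau):
    """kappa = (2 - 2^(beta - 1)) * tau."""
    return (2.0 - 2.0 ** (beta - 1.0)) * tau


def relative_variance(w0, delta, kap):
    """Assumed analytic form: (2 W_0 / Delta) * [(e^kappa - 1) / kappa - 1]."""
    return (2.0 * w0 / delta) * (math.expm1(kap) / kap - 1.0)


w0 = effective_width(37.0, 1216.0, 4.5)                    # about 3.7 Angstrom
var_rel = relative_variance(w0, 25.0, kappa(1.43, 0.67))
print(f"W_0 = {w0:.2f} A, Var(Q)/Q^2 = {var_rel:.3f}")
```

With the quoted inputs this evaluates to about 0.076, matching the value given in the text; the bracketed factor vanishes as $\kappa \to 0$, as it should when there is no absorption.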

It appears, then, that there is no significant clumping of the Lyman $\alpha$ clouds at
high redshifts $z \approx 3.5$ in the zero-lag variance: essentially all of the variance, i.e., the
fluctuations in the absorption, can be explained as Poisson fluctuations in a randomly
distributed population of clouds drawn from the distribution of equation (18). In the
language introduced above, $\xi_P$ is as expected from equation (fHf ), and $\xi_S$ is consistent with
zero.



What about the covariance at nonzero lags, equation (40)? The solid circles in Figure
7 show the result of calculating this quantity (or, actually, $\tau^2$ times this quantity) for the
SSG data. Several remarks and caveats are necessary in interpreting this Figure:

It is actually quite difficult to determine the zero point of the covariance with high
accuracy. The structure function of the absorption data is found to increase slowly with
increasing lag, as would be expected from small errors in fitting the continuum slopes of
the individual SSG quasars. This causes the raw correlation function (essentially a constant
minus the structure function) to have an ill-defined additive constant (see e.g. §3 of Press,
Rybicki, and Hewitt 1992). We fix the zero point in Figure 7 by the assumption that there
should be no significant mean correlation $\xi$ on scales of 10 to 20 bins (corresponding, we
will see below, to several tens of comoving megaparsecs): In this range, we fit for an r.m.s.
continuum slope error, and extrapolate the resulting fitted model back to smaller lags.

The astute reader might notice that the value at zero lag,

$$\mathrm{Var}(Q)/\bar{Q}^2 = 0.073 \pm 0.008 \tag{49}$$

is different from Paper I's value, quoted in equation (36) above. The reason is the different
method, just described, used to determine the zero point. (See Paper I for a description
of the moving average method there used to remove systematic errors due to incorrect
extrapolations of the continuum spectra of the SSG quasars.) One would be correct in
concluding that there may well be on the order of 0.01 in unremoved systematic errors in
the data shown in Figure 7. The error bars shown are determined by resampling of the SSG
quasars, and should accurately reflect the statistical errors, but not the systematic errors,
of the points.

The solid line in Figure 7 is the result of predicting $\mathrm{Var}(Q)/\bar{Q}^2$ from equation (0), a
square slit response function of width 25 Å (SSG's value), and assuming Poisson distributed
clouds (equations [37] and [47]) with no additional correlation function, i.e., $\xi_S = 0$. There
are no new free parameters in this prediction, i.e., it is not normalized to the data in any
way. We see that not only is the agreement excellent at zero lag, but also the expected
slit response is accurately reproduced at lags 1 and 2. In bins 3 through 7 or 8, one sees a
very interesting "shoulder" decaying from about 0.01 (or $\xi = 0.01\,\tau^{-2} \approx 0.02$), to about half
this value. While there may be some diffraction widening of the slit response present in
bin 3, it is quite unlikely that it should extend to bin 7 (70 Å out). Further, the shoulder is
highly significant in terms of the statistical errors measured by resampling. Unfortunately,
the shoulder is at about the amplitude where we do not have confidence in our removal
of systematics. We consider it suggestive as a marginal, but not believable, detection of a
correlation function at high redshifts and large scales.

With greater confidence we can assert a value of twice the shoulder value as a good
upper limit to the value of the correlation function $\xi$ on scales from 3 to 7 bins. That is, at
redshift $z \approx 3.5$,

$$\begin{aligned} \xi &< 0.04 \quad \text{for } \Delta\lambda_{\rm obs} = 30\ \text{Å} \\ \xi &< 0.02 \quad \text{for } \Delta\lambda_{\rm obs} = 70\ \text{Å} \end{aligned} \tag{50}$$

The relation between $\Delta\lambda_{\rm obs}$ and comoving distance scale $\Delta r$ was already given in equation
(23). It follows that the 30 Å and 70 Å scales in equation (50) correspond to 7.8 and 18 $h^{-1}$
Mpc, respectively, if $\Omega \approx 1$; or to 16 and 38 $h^{-1}$ Mpc, respectively, if $\Omega \approx 0$.
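
The conversion from observed wavelength separation to comoving distance can be sketched as follows. The code assumes the relation $dr = 3000\,h^{-1}\,{\rm Mpc}\;dz / [(1+z)(1+\Omega z)^{1/2}]$ with $dz = d\lambda_{\rm obs}/\lambda_0$ and $\lambda_0 = 1216$ Å; the function name `comoving_sep` is hypothetical.

```python
import math

LAMBDA_LYA = 1216.0     # Lyman-alpha rest wavelength [Angstrom]
HUBBLE_DIST = 3000.0    # c / H_0 in units of h^-1 Mpc


def comoving_sep(d_lambda_obs, z, omega):
    """Comoving separation [h^-1 Mpc] for an observed wavelength separation [Angstrom]."""
    dz = d_lambda_obs / LAMBDA_LYA
    return HUBBLE_DIST * dz / ((1.0 + z) * math.sqrt(1.0 + omega * z))


Z_MEAN = 3.5
r30_flat = comoving_sep(30.0, Z_MEAN, 1.0)
r70_flat = comoving_sep(70.0, Z_MEAN, 1.0)
r30_open = comoving_sep(30.0, Z_MEAN, 0.0)
r70_open = comoving_sep(70.0, Z_MEAN, 0.0)
print(f"Omega = 1: {r30_flat:.1f} and {r70_flat:.1f} h^-1 Mpc")
print(f"Omega = 0: {r30_open:.1f} and {r70_open:.1f} h^-1 Mpc")
```

The results agree with the distances quoted in the text to the stated precision.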

It is tempting to identify the apparent shoulder as the progenitor, in the gravitational
instability picture, of correlation structure that is seen today. In an $\Omega = 1$ universe,
the growth of perturbations in the linear regime would increase $\xi$ to the present by a
factor $(1+z)^2 \approx 20$, so that the scaled values of equation (50) are within a plausible
factor of today's observed values (e.g., the IRAS galaxies in Saunders et al. 1992) on the
corresponding spatial scales. Indeed, one might hope to determine the value of $\Omega$ from
such an analysis, since the growth factor of perturbations to the present is suppressed by a
known factor in lower $\Omega$ cosmologies.

The problems with carrying out this program include (i) the uncertain systematic
errors in Figure 7; (ii) the fact that, if simplistic comparison with today's correlation
function is justified at all, then one predicts much larger covariances at lags up to 2 than are
seen; (iii) the fact that the Lyman $\alpha$ clouds are particularly likely, in most simple theoretical
pictures, to be "ionized away" when they are nearest to density peaks where quasars (or
other ionization sources) may have formed. In fact, item (iii) may be the explanation of
item (ii); but it is clear that no simple analysis can carry any degree of credibility. We
therefore defer these issues to a later paper.

Here, sticking more closely to the data, it is of interest to see if any evolution with
redshift can be detected in the correlation properties of the SSG sample. Figure 7 also plots,
as smaller open symbols, the results of dividing the sample into 3 redshift ranges, $z < 3$,
$3 < z < 3.5$, and $3.5 < z$. The corresponding dashed curves are the predictions of the pure
Poisson model, equation (0), with no adjusted parameters and with the assumption of a
Poisson distribution, $\xi_S = 0$. In general, the prediction for higher redshifts is towards larger
fluctuations, due both to the $\kappa$ dependence in equation (37) and to the explicit dependence
on $1 + z$ in equation (47). One sees that, to the level of possible systematic errors previously
described, there is good agreement for the two higher redshift samples, with mean redshifts
$\langle z\rangle = 3.3$ and $\langle z\rangle = 3.8$. For the lowest-redshift sample, with mean redshift $\langle z\rangle = 2.8$,
there is some indication (with large statistical error bars, as shown in the Figure) of an
excess variance: the triangles (highest) are about $2\sigma$ from the dotted (lowest) curve. The
excess has the triangular instrumental profile, suggesting that it could be explained as an
unresolved correlation function (equation ^) with



This value is several times larger than the value that we infer from the data of Webb (1986)
as shown in Carswell (1989); however the error bars are large, so the discrepancy may not
be real.

Qualitatively, our data support previous suggestions, e.g., by Carswell on the basis
of higher-redshift data from Atwood et al. (1985) and Carswell et al. (1987), that the
correlation function grows from an undetectable to a significant level in the fairly narrow
redshift range between $z \approx 3.5$ and $z \approx 2.5$. (See also Shaver, 1988.) This redshift range
also marks the epoch where (i) there is a change in the value of $\gamma$, the exponent relating
cloud density and redshift (see, e.g., Figure 9 in SSG), and (ii) metal line clouds become
important. We think that it is fair game for theorists to infer a connection among these
phenomena. In particular, it may be that the spatially correlated population of Lyman $\alpha$
clouds that appears at redshifts $z \lesssim 2.8$, perhaps associated with galaxy formation, is a
completely distinct population from the more rapidly disappearing (towards low redshifts)
population of primordial, spatially uncorrelated, clouds.




(51) 
5. Where Are All the Baryons?



Models for light element production in the hot big bang give good limits on $\Omega_b h^2$,
where $\Omega_b$ is the present fraction of critical density due to baryonic matter. Recent analyses
(Olive et al. 1990, Walker et al. 1991) give the value and uncertainty

$$\Omega_b h^2 = 0.013 \pm 0.003 \tag{52}$$



Obviously it is of interest to know whether this density of baryons can be accommodated, 
at high redshift, completely in the observed density of Lyman $\alpha$ clouds. By far the greatest
uncertainty, as we shall now see, lies in the ionization state of the clouds and, in turn, in 
the question of whether the ionization state is consistent with the interpretation of observed 
b values as thermal, and/or with an equilibrium thermal and ionization model. 

The clouds are predominantly ionized. Measured, or deduced, column densities $N$ refer
in the first instance to neutral hydrogen (whose corresponding volume number density we
will denote $n$), while the value of $\Omega_b$ is determined by the corresponding column or volume
densities of total hydrogen, which we denote $N_H$ and $n_H$. Let $f_0$ denote the hydrogen
neutral fraction; and let $\rho$ denote the total cosmological baryon density (including an
assumed helium mass fraction $\approx 0.25$); so we have the relations

$$N = f_0 N_H \qquad n = f_0 n_H \qquad \rho = 1.33\,m_p n_H \tag{53}$$



We can compute the mean (smoothed) density of hydrogen $\bar{n}_H$ at a given redshift $z$
along the line of sight by knowing, as we do, the mean density of clouds with each possible
column density $N$ (equation 18), as follows:

$$\bar{n}_H = \frac{dz}{dr_p}\int \frac{N}{f_0(N)}\,\mathcal{A}_0\,(1+z)^{\gamma} N_{14}^{-\beta}\,dN_{14} = 2.8 \times 10^{-14}\,h\ {\rm cm}^{-3}\,(1+z)^{\gamma+2}(1+\Omega z)^{1/2}\int \frac{N_{14}^{1-\beta}}{f_0(N)}\,dN_{14} \tag{54}$$

Here $r_p$ is physical (proper) distance (see equation 23), and $f_0(N)$ denotes the harmonic
mean value of $f_0$ for all clouds with neutral column density between $N$ and $N + dN$,
averaging over any other internal parameters (notably $b$). That is, $f_0(N)$ is the value
defined by $N_H = N/f_0(N)$. It is an important point that, if $f_0(N)$ is averaged correctly,
then equation (54) is independent of the physical size, shape, and distribution of the clouds,
e.g., whether they are spherical, or thin sheets, clumped or unclumped, etc. Equation (54)
requires only that a random line of sight be a fair sample of the universe.

We can rewrite equation (54) as a direct measurement of $\Omega_b h^2$,

$$\Omega_b h^2 = \frac{1.33\,m_p\,\bar{n}_H}{(1+z)^3}\,\frac{8\pi G}{3\hat{H}_0^2} = 3.4 \times 10^{-9}\,h\,(1+z)^{\gamma-1}(1+\Omega z)^{1/2}\int \frac{N_{14}^{1-\beta}}{f_0(N)}\,dN_{14} \tag{55}$$

where $\hat{H}_0 = 100$ km s$^{-1}$ Mpc$^{-1}$.
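
The step from a mean hydrogen density to $\Omega_b h^2$ is a unit conversion through the critical density for $H_0 = 100$ km s$^{-1}$ Mpc$^{-1}$. The sketch below is a consistency check rather than a derivation: it assumes a proper-density coefficient of $2.8 \times 10^{-14}\,h$ cm$^{-3}$ in front of the redshift factors, a mass per hydrogen atom of $1.33\,m_p$, and standard c.g.s. constants; dividing by $(1+z)^3$ to reach the present epoch is what changes the exponent $\gamma + 2$ into $\gamma - 1$.

```python
import math

G_CGS = 6.674e-8        # gravitational constant [cm^3 g^-1 s^-2]
M_PROTON = 1.6726e-24   # proton mass [g]
MPC_CM = 3.0857e24      # one megaparsec [cm]

# Critical density for H_0 = 100 km/s/Mpc, i.e. rho_crit / h^2.
h100 = 1.0e7 / MPC_CM                                # [s^-1]
rho_crit_h2 = 3.0 * h100 ** 2 / (8.0 * math.pi * G_CGS)

# Assumed coefficient of the mean proper hydrogen density [cm^-3 per unit h].
COEF_NH = 2.8e-14
coef_omega = 1.33 * M_PROTON * COEF_NH / rho_crit_h2

print(f"rho_crit / h^2 = {rho_crit_h2:.3e} g cm^-3")
print(f"Omega_b h^2 coefficient = {coef_omega:.2e}")
```

The result is within a few percent of $3.4 \times 10^{-9}$; the small difference reflects rounding of the input coefficient.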



5.1. Inconsistency of Equilibrium Model with $\Omega_b$

Let us consider first the baseline hypothesis that the clouds are in thermal and 
ionization equilibrium with a background UV flux, and that the measured b velocity widths 
are thermal. (We will see, in fact, that this baseline hypothesis leads to an observational 
reductio ad absurdum.) 

It follows from the analysis of Black (1981) that there is a broad regime of density
$\rho$ and incident UV intensity $J_\nu$ (Lyman limit intensity in units ergs cm$^{-2}$ s$^{-1}$ Hz$^{-1}$ sr$^{-1}$)
where both the temperature of the gas, and also its neutral fraction, depend only on the
combination $J_\nu/\rho$. In particular, if $\rho$ and $J_\nu$ are in c.g.s. units, then

$$T = 1690\ {\rm K}\ \mu \left(\frac{J_\nu}{\rho}\right)^{2/7} \quad \text{or} \quad b = 5.3\ {\rm km\ s^{-1}}\ \mu^{1/2} \left(\frac{J_\nu}{\rho}\right)^{1/7} \tag{56}$$

and

$$f_0 = 397 \left(\frac{J_\nu}{\rho}\right)^{-1.22} \tag{57}$$

(Here $\mu$ is the mean molecular weight of the ionized gas.) According to Black,
these expressions are applicable at least in the range $5 \times 10^3 < T < 5 \times 10^5$ K,
$10^{-6} < n_H < 10^{-3}$ cm$^{-3}$, $10^{-22} < J_\nu < 10^{-20}$ (c.g.s.; note that Black's $J$ is $4\pi$ times our $J_\nu$).

Equations (56) and (57) imply a unique relationship between the neutral fraction $f_0$
and the velocity (or temperature) parameter $b$,

$$f_0 = \left(\frac{b}{8.56\ {\rm km/s}}\right)^{-8.54}\left(\frac{\mu}{0.64}\right)^{4.27} = 3.7 \times 10^{-6}\left(\frac{b}{37\ {\rm km/s}}\right)^{-8.54}\left(\frac{\mu}{0.64}\right)^{4.27} \tag{58}$$

and a relationship among $n_H$, $b$, and $J_\nu$,

$$n_H = \frac{\rho}{1.33\,m_p} = 1.2 \times 10^{-4}\ {\rm cm}^{-3}\left(\frac{b}{37\ {\rm km/s}}\right)^{-7}\left(\frac{\mu}{0.64}\right)^{3.5}\left(\frac{J_\nu}{10^{-21}\ {\rm c.g.s.}}\right) \tag{59}$$

(Hereafter, we will take $\mu = 0.64$ and suppress the parametric dependence on $\mu$.) Obviously
the large values of the exponents in $b$ require that we use these relations with some caution.
Various authors have estimated $J_\nu$ as being in the range $10^{-22}$ to $10^{-21}$; see Bajtlik, Duncan,
and Ostriker (1988) for discussion.
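
The scaling values for the neutral fraction and the hydrogen density at $b = 37$ km s$^{-1}$ can be checked by inverting the temperature relation for $J_\nu/\rho$. The sketch below rests on assumptions that should be read as such: $b = 5.3\ {\rm km\ s^{-1}}\,\mu^{1/2} (J_\nu/\rho)^{1/7}$, a neutral fraction $f_0 = 397\,(J_\nu/\rho)^{-8.54/7}$ (the exponent being inferred from the $b^{-8.54}$ scaling), and $\rho = 1.33\,m_p n_H$. The function names are hypothetical.

```python
import math

M_PROTON = 1.6726e-24   # proton mass [g]


def j_over_rho(b_km_s, mu=0.64):
    """Invert b = 5.3 km/s * sqrt(mu) * (J/rho)^(1/7) for J/rho in c.g.s. units."""
    return (b_km_s / (5.3 * math.sqrt(mu))) ** 7


def neutral_fraction(b_km_s, mu=0.64):
    """Assumed equilibrium neutral fraction f_0 = 397 * (J/rho)^(-8.54/7)."""
    return 397.0 * j_over_rho(b_km_s, mu) ** (-8.54 / 7.0)


def hydrogen_density(b_km_s, j_nu, mu=0.64):
    """Total hydrogen number density [cm^-3] from rho = 1.33 * m_p * n_H."""
    rho = j_nu / j_over_rho(b_km_s, mu)
    return rho / (1.33 * M_PROTON)


f0_37 = neutral_fraction(37.0)
nh_37 = hydrogen_density(37.0, 1.0e-21)
print(f"f_0(37 km/s) = {f0_37:.2e}")
print(f"n_H(37 km/s, J = 1e-21) = {nh_37:.2e} cm^-3")
```

Because $f_0$ varies as $b^{-8.54}$, a 10% change in $b$ changes the neutral fraction by more than a factor of two, which is why these relations are best used in the inverse direction.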

A first application of equation (55) is to assume that $f_0$ is independent of $N$ and
uniform throughout any given cloud. This would be true for the base case of pressure
confined clouds in a uniform confining medium and with a uniform UV illumination. Then
there is no subtlety in averaging $f_0(N)$; it can come out of the integral as $f_0$ in equation
(58). The remaining integral is weakly divergent, so we must adopt some value $N_{\max}$ for a
cutoff in the column density, in terms of which we get

$$\Omega_b h^2 = 1.6 \times 10^{-3}\,h\,(1+z)^{\gamma-1}(1+\Omega z)^{1/2}\left(\frac{b}{37\ {\rm km/s}}\right)^{8.54} N_{14,\max}^{0.57} \tag{60}$$

Here the normalizing value for $b$, namely 37 km s$^{-1}$, has been chosen to be the observed
mean value for $b$ (Table 1).



One sees that the cosmologically observed value for $\Omega_b h^2$ (equation 52) is not just
easy to produce, it is easy to vastly exceed: Table 1 shows that the appropriate average
$\langle b^{8.54}\rangle^{1/8.54}$ is apparently on the order of 70 km s$^{-1}$; and $N_{14,\max}$ is surely not less than
100, corresponding to $N_{\max} = 10^{16}$ cm$^{-2}$, and probably greater. These values would imply
$\Omega_b h^2 \sim 5h$, which is excluded even for a fully baryonic $\Omega = 1$ universe!

One might momentarily wonder whether internal structure within a single cloud, with 
the temperature T varying along the line of sight provides a possible resolution. However, 
such variation always acts in the wrong direction: It is possible to hide more baryons within 
a cooler, narrower, undetectable line core; but baryons implied by an observed thermal line 
width must always be there. 

One can turn the argument around by inverting equation (60) for $b$. (Now we are on
the good side of the large exponent, and thus quite insensitive to the other assumptions
made.) One finds that, for $N_{\max}$ in the plausible range $10^{18}$ cm$^{-2}$ down to $10^{16}$ cm$^{-2}$, the
observed $\Omega_b$ is consistent with an ionization fraction $f_0$ that would in equilibrium derive
from thermal values of $b$ in the range 20 to 25 km s$^{-1}$. Looking back at Figure 5, one sees
that this velocity range is where the observed $b$ distribution is falling off rapidly to smaller $b$
values; in other words, considering the possibility of observational errors and other sources
of dispersion, the deduced equilibrium $b$ resembles a lower bound to the observed $b$ values.
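
The inversion for $b$ can be sketched numerically. The example assumes the form $\Omega_b h^2 = 1.6 \times 10^{-3}\,h\,(1+z)^{\gamma-1}(1+\Omega z)^{1/2}\,(b/37\ {\rm km/s})^{8.54}\,N_{14,\max}^{0.57}$, and the particular choices $z = 3.5$, $\Omega = 1$, $h = 1$, and $\gamma = 2.46$ are assumptions made here for illustration, not values stated for this step in the text.

```python
import math


def omega_b_h2(b_km_s, n14_max, z=3.5, omega=1.0, h=1.0, gamma=2.46):
    """Assumed baryon density implied by a uniform equilibrium neutral fraction."""
    redshift_factor = (1.0 + z) ** (gamma - 1.0) * math.sqrt(1.0 + omega * z)
    return 1.6e-3 * h * redshift_factor * (b_km_s / 37.0) ** 8.54 * n14_max ** 0.57


def equilibrium_b(target, n14_max, **kwargs):
    """Thermal b [km/s] for which the implied Omega_b h^2 equals the target value."""
    reference = omega_b_h2(37.0, n14_max, **kwargs)
    return 37.0 * (target / reference) ** (1.0 / 8.54)


TARGET = 0.013                                   # nucleosynthesis value of Omega_b h^2
b_cut16 = equilibrium_b(TARGET, 1.0e2)           # N_max = 1e16 cm^-2
b_cut18 = equilibrium_b(TARGET, 1.0e4)           # N_max = 1e18 cm^-2
b_half_h = equilibrium_b(TARGET, 1.0e2, h=0.5)   # sensitivity to the assumed h
print(f"b_eq = {b_cut16:.1f} km/s (N_max = 1e16), {b_cut18:.1f} km/s (N_max = 1e18)")
print(f"b_eq = {b_half_h:.1f} km/s for h = 0.5 and N_max = 1e16")
```

Under these assumptions the equilibrium $b$ comes out at roughly 20 to 25 km s$^{-1}$, and halving $h$ moves it by under 10%, which illustrates the insensitivity that the large exponent provides when the relation is run backwards.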



5.2. Inconsistency of Equilibrium Model with Size and Mass of Clouds 
Let us follow the previous logic to the extreme. Equations (58) and (59) give a
characteristic length scale $L$ for a cloud with parameters $N$, $b$, $J_\nu$, namely,

$$L \sim \frac{N}{f_0\,n_H} \sim 7.3 \times 10^5\ {\rm pc}\left(\frac{b}{37\ {\rm km/s}}\right)^{15.54}\left(\frac{N}{10^{15}\ {\rm cm^{-2}}}\right)\left(\frac{J_\nu}{10^{-21}\ {\rm c.g.s.}}\right)^{-1} \tag{61}$$

There are two separate points to be made about equation (61). First, the value obtained for
$L$ is implausibly large, not only by the apparent factor of ~100 for the parameters having
the scaling values given (see discussion of cloud sizes at the end of §3.3), but also by an
additional factor of about $2^{15.54} \approx 4 \times 10^4$ if the distribution for $b$ has the broad tail found
in §3.

In view of the large exponent on $b$, a better use of equation (61) is, as before, to run it
backwards: Suppose that $L$ is in the range $10^{4\pm1}$ pc, as seems required by other observations,
and that the other parameters have the scaling values given. Then (61) implies

$$\langle b_{\rm eq}^{15.54}\rangle^{1/15.54} \approx 28 \pm 5\ {\rm km\ s^{-1}} \tag{62}$$

where $b_{\rm eq}$ is the thermal $b$ value that would be in equilibrium with the required neutral
fraction $f_0$.
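
Running the size relation backwards is a one-line calculation. The sketch below uses only the scaling values quoted in the text ($L \approx 7.3 \times 10^5$ pc at $b = 37$ km s$^{-1}$, exponent 15.54) and holds the other parameters at their scaling values; the function name is hypothetical.

```python
def equilibrium_b_from_size(size_pc, l_scale_pc=7.3e5, b_scale=37.0, exponent=15.54):
    """Thermal b [km/s] for which the characteristic cloud size equals size_pc."""
    return b_scale * (size_pc / l_scale_pc) ** (1.0 / exponent)


b_small = equilibrium_b_from_size(1.0e3)   # L = 10^3 pc
b_mid = equilibrium_b_from_size(1.0e4)     # L = 10^4 pc
b_large = equilibrium_b_from_size(1.0e5)   # L = 10^5 pc
print(f"b_eq = {b_small:.1f}, {b_mid:.1f}, {b_large:.1f} km/s for L = 1e3, 1e4, 1e5 pc")
```

Two decades of uncertainty in $L$ translate into a spread of only about 4 to 5 km s$^{-1}$ on either side of 28 km s$^{-1}$.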

The point can be made even more forcefully if we write the characteristic mass $M$ of
the cloud (more accurately, the mass of that part of the cloud whose size is on the order of
the line of sight penetration),

$$M \sim \rho L^3 \sim 1.48 \times 10^{12}\,M_\odot \left(\frac{N}{10^{15}\ {\rm cm^{-2}}}\right)^{3}\left(\frac{b}{37\ {\rm km/s}}\right)^{39.62}\left(\frac{J_\nu}{10^{-21}\ {\rm c.g.s.}}\right)^{-2} \tag{63}$$

Here the value of the exponent in $b$ is truly worthy of awe. If we reverse the equation, and
assume $M$ in the range $10^{8\pm3}\,M_\odot$, which is cosmologically plausible, we get

$$\langle b_{\rm eq}^{39.62}\rangle^{1/39.62} \approx 29 \pm 6\ {\rm km\ s^{-1}} \tag{64}$$

An additional interesting point about equation (63) is that, because of the enormous
exponent, it is simply not possible to accommodate any significant range of $b$ within the
cosmologically available mass range for $M$, if $b$ is related to neutral fraction by equation
(60) or anything similar. A factor of two variation in $b$ from cloud to cloud, at fixed $N$,
induces a mass range of a factor $10^{12}$! Also note that the observed range of $N$, which we
take conservatively in this paper to be ~$10^3$, itself induces a mass range of a factor $10^9$.
Moreover, many observers (e.g., Petitjean, 1992) show evidence of $N$ itself extending over 8
orders of magnitude!
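
The dynamic-range statements follow directly from the exponents. A minimal check, using only the scalings $M \propto b^{39.62}$ and $M \propto N^{3}$ stated in the text:

```python
import math

MASS_EXPONENT_B = 39.62   # M scales as b^39.62
MASS_EXPONENT_N = 3.0     # M scales as N^3

decades_from_b = MASS_EXPONENT_B * math.log10(2.0)     # factor of two in b
decades_from_n = MASS_EXPONENT_N * math.log10(1.0e3)   # factor of 10^3 in N
print(f"factor of 2 in b  -> {decades_from_b:.1f} decades in mass")
print(f"factor 1e3 in N   -> {decades_from_n:.0f} decades in mass")
```

A factor of two in $b$ therefore spans nearly twelve decades in mass, and the conservative $10^3$ range in $N$ spans nine.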
6. Possible Resolutions

Quite obviously, the UV-driven equilibrium thermal model for $b$ is unviable. On the one
hand (§3), a broad distribution of $b$'s, extending to 80 km s$^{-1}$ or higher, is observationally
required. On the other hand (§5), the neutral fractions implied by such values of $b$ are
incompatible with observed limits on $\Omega_b$, the size and mass of the clouds, and the dynamic
range of mass available to the clouds.

Any possible resolution of the paradox must substantially disconnect a measured value 
of b from the ionization state of its cloud. There would seem to be two ways to do this. 
First, as has been long noted in the literature, the observed value b could be due to bulk 
matter motions rather than thermal velocities. 

The bulk motion hypothesis raises a range of theoretical difficulties that have been 
discussed by other investigators. More observationally, in the context of this paper, it is 
worth noting that, if b includes a significant component of bulk motions, then any curve of 
growth analysis based on Voigt profiles (including that of §2) will be very seriously in error 
in the saturated region. In particular, if, within a single cloud, the velocity tail falls off less 
rapidly than a Gaussian (as one would expect of the broad-tailed distributions characteristic 
of hydrodynamic motions), then the actual value of N, the neutral column density, can 
be much less than deduced from standard curve of growth. The widely observed number 
distribution of $N$'s up to high values is then based on an incorrect analysis, and the fact
that the observed distribution is a power law that extrapolates smoothly from smaller N 
values becomes something of a miracle. 

One should also mention the observational issue usually termed "the $b$-$N$ controversy",
and described in Shaver, Wampler, and Wolfe (1991). There, claims of a positive observed
correlation between $b$ and $N$ (which would be theoretically expected in most mechanisms
for bulk motion) have proved quite difficult to substantiate and are thought by many
to be entirely due to selection effects. In this context we should note the success of our
assumption, in §3, of uncorrelated $N$'s and $b$'s in satisfying all observational constraints
considered.

Finally, it is hard to reconcile most mechanisms for generating bulk motion with 
the observed lack of cloud clumping on small scales (§4). Hydrodynamic or gravitational 
processes capable of driving highly supersonic differential motions within a cloud should 
also be expected to act on scales comparable to intercloud distances (§3.3). Our analysis 
of fluctuations in §4 would easily have detected (e.g. in the zero-lag bin of Figure 7) any 
tendency for clouds to be clumped in groups as small as a few, or for that matter any 
comparable tendency for clouds to be more uniformly distributed than random. The lack 
of such a detection tends to support less violent scenarios of cloud formation and evolution
than those that can give large bulk motions. 

An alternative to the bulk motion hypothesis (as particularly emphasized to us by M. 
Rees; see also Rees 1988) is the notion that observed $b$'s do represent the thermal cloud
state, but that this thermal state is not in equilibrium with the ionization state determined 
by interaction with the UV background. In particular, it may be possible to place the 
clouds in a physical regime where their thermal cooling times are not short compared with 
the (then) Hubble time. 

If such a picture is combined with the idea that we are seeing clouds that have collapsed
and heated adiabatically, then a consistent picture may possibly emerge: The neutral
fraction $f_0$ is determined over time by interaction with a background UV flux. There is an
implied concomitant universal heating of the gas, to a temperature that one might identify
with a value at or below the lower edge of the $b$ distribution in Figure 5.

Subsequently, clouds collapse by volume factors that vary somewhat from cloud to
cloud, giving the observed broad, indeed thermal, distribution of $b$. Since adiabatic heating
is inefficient at altering ionization, the values $f_0$ remain in a universal, narrow range, as
required by the arguments of §5. In Paper III we will investigate this, and related, models.



7. Conclusions 

The conclusions of this paper are as follows: 

1. The equivalent width ratios for the Lyman sequence found in Paper I are completely
consistent with a featureless power law distribution of Lyman $\alpha$ clouds in column density
$N$, with exponent $\beta = 1.43 \pm 0.04$. Distributions with a break in the power law can also fit
the equivalent width ratio data when the break is around $N = 10^{15}$ cm$^{-2}$.

2. Assuming the above featureless power law distribution in $N$, standard curve of
growth analysis is able to reproduce the detailed distribution of cloud equivalent widths
(notably both the exponential tail at large equivalent widths, and the turn-up at small
equivalent widths), but only for distributions in $b$ that have mean values ~37 km s$^{-1}$ and
a significant tail extending as high as 70 km s$^{-1}$. Such distributions are in fact observed by
some observers.

3. The turn-up at small equivalent widths likely does not mark any change in the 
properties of the clouds, but is a consequence purely of the curve of growth along the 
observed line of sight.

4. Broken power laws, which do fit the N distribution, do not fit the equivalent width 
distribution as well, but they are not wholly ruled out. 

5. The long-observed exponential tail in the distribution of equivalent widths can be 
real without being physically fundamental: it emerges as an artifact of combining the tail 
of the distribution in b with the slowly rising equivalent widths characteristic of the curve 
of growth in the saturated region. 

6. The posited distributions in N and b, plus Paper I's normalization of the absorption 
as a function of redshift, give an absolute normalization on the number of clouds as a 
function of N, b, and z. Upper limits on the physical size of clouds, in the range of 50 to 
120 h^-1 kpc at z ~ 4.2, follow.

7. At the highest redshifts studied, the posited distributions in N and b are able 
completely to account for fluctuations in the absorption (along different lines of sight or as 
a function of redshift), as due to Poisson fluctuations in an uncorrelated cloud distribution. 
However, there are marginal (not by themselves very believable) detections of a nonzero 
correlation function in two different regimes: (i) At the lowest redshift accessible to this 
study, z ~ 2.8, there is some evidence of unresolved correlation on a scale ≲ 2 h^-1 Mpc,
possibly with amplitude somewhat greater than previously reported. (ii) In the full sample,
on scales of tens of Mpc, there is some evidence of correlation at a level ξ ~ 0.01 or 0.02.
Because of possible systematic errors, however, we prefer to take twice these values as
upper limits.

8. If the observed b distribution is thermal, and if the ionization state of the gas is 
related to its thermal state by an equilibrium model (e.g., Black 1981), then the observed
broad b distribution would imply 1 in clouds. Similarly, the broad b distribution 
implies clouds that are both too large and too massive. 

9. On the other hand, if the ionization state is that implied by a UV-heated
temperature of about 20 to 25 km s^-1, then at high redshift the Lyman α clouds contain
all the baryons deduced from models of light element production in the hot big bang (i.e., 
probably all the baryons in the universe). 

10. The broad b distribution must therefore be ascribed either to bulk motions (which 
raises both theoretical difficulties and likely incompatibilities both with the observed 
featureless power law distribution in N obtained from curve-of-growth analyses, and with
the observed lack of small-scale cloud clumping), or else the observed clouds must be out 
of thermal equilibrium, and heated by a process that is relatively inefficient at ionization. 
Recent adiabatic collapse is a plausible candidate.

We have benefited from discussions with Martin Rees, Don Schneider, John Bahcall, 
David Spergel, John Black, Ed Turner, and John Huchra. This work was supported in part 
by the National Science Foundation (PHY-91-06678). 

Fig. 1. — Equivalent width W for Lyman series lines in hydrogen as a function of column
density N and thermal velocity parameter b. Equivalent widths grow linearly with N in
the unsaturated region below N ~ 10^14-10^15 cm^-2, until they saturate at a value set by the
thermal Doppler width. In the saturated regime, widths grow slowly as ~ (log N)^1/2.
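
The behavior described in this caption can be reproduced with a minimal numerical sketch. It is not the calculation used in the paper: it treats only Lyman α, keeps only the Gaussian (Doppler) core of the line profile and ignores damping wings, and the function name, grid size, and atomic constants (oscillator strength 0.4164, rest wavelength 1215.67 Å) are assumptions introduced here.

```python
import numpy as np

C_KMS = 2.99792458e5      # speed of light in km/s
LAMBDA_A = 1215.67        # Lyman-alpha rest wavelength in Angstrom (assumed value)
F_OSC = 0.4164            # Lyman-alpha oscillator strength (assumed value)


def equivalent_width(column_density, b_kms, n_grid=4001):
    """Rest equivalent width in Angstrom for a Doppler-only line profile.

    column_density is in cm^-2 and b_kms in km/s. Damping wings are
    ignored, which is an assumption of this sketch.
    """
    # Line-center optical depth: sqrt(pi) e^2 / (m_e c) * N f lambda / b, in cgs.
    tau0 = 1.497e-2 * column_density * F_OSC * (LAMBDA_A * 1e-8) / (b_kms * 1e5)
    # Velocity offset from line center in units of b.
    x = np.linspace(-10.0, 10.0, n_grid)
    dx = x[1] - x[0]
    absorbed = 1.0 - np.exp(-tau0 * np.exp(-x**2))
    # W = (lambda * b / c) * integral of the absorbed fraction over x.
    return float(LAMBDA_A * b_kms / C_KMS * np.sum(absorbed) * dx)


if __name__ == "__main__":
    for log_n in (12, 13, 14, 15, 16):
        widths = [equivalent_width(10.0**log_n, b) for b in (20.0, 40.0, 80.0)]
        print(log_n, ["%.4f" % w for w in widths])
```

At low column density the printed widths double when N doubles and are independent of b. At high column density they grow only slowly with N and scale roughly with b, which is the saturated regime the caption refers to.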

Fig. 2. — Goodness-of-fit χ^2 values as a function of exponent β for fitting observed Lyman
series absorption ratios and a curve of growth model to a power law model for the number 
of clouds as a function of their column density N. With four measured ratios, there are 
four degrees of freedom. The value of the exponent is seen to be narrowly determined (note 
expanded abscissa) independently of the assumed value for the velocity parameter b. 

Fig. 3. — Goodness-of-fit χ^2 values as in Figure 2, but fitting for the characteristic column
density N associated with the broken power-law model of equation (13) [broad parabolas]
or the unrealistic model of a single, universal N [narrow parabolas]. One sees that, while 
none of these models are excluded by consideration of absorption ratios alone, the ratios are 
fairly powerful at fixing one scale parameter. 

Fig. 4. — Relative number of Lyman α clouds as a function of their rest equivalent width.
Filled circles: data compiled by Murdoch et al. (1986). Note the famous exponential 
tail extending over > 2 decades in cloud number density. Solid, long- and short-dashed 
curves: Predictions based on curve of growth analysis and the three distributions for velocity 
parameter b that are shown in Figure 5. The exponential tail emerges naturally from b 
distributions with ⟨b⟩ ≈ 37 km s^-1 and σ_b ≈ 20 km s^-1 (see text for details). Dotted and
dash-dot curves: Models which deviate from either a broad b distribution or a pure power-law
distribution in N do not give acceptable fits to the observed equivalent width distribution. 
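
The way a broad b distribution populates the large equivalent width tail can be illustrated with a small Monte Carlo sketch. It is not the fitting procedure of the paper. It assumes a Doppler-only Lyman α curve of growth, a pure power law in N with exponent 1.43 between 10^13 and 10^16 cm^-2, and a Gaussian in b with mean 37 and width 20 km s^-1 truncated below 5 km s^-1. The truncation floor, the column density limits, the sample size, and all function names are choices made here for illustration.

```python
import numpy as np

C_KMS = 2.99792458e5      # speed of light in km/s
LAMBDA_A = 1215.67        # Lyman-alpha rest wavelength in Angstrom (assumed value)
F_OSC = 0.4164            # Lyman-alpha oscillator strength (assumed value)


def equivalent_width(column_density, b_kms):
    """Doppler-only rest equivalent width in Angstrom, vectorized over clouds."""
    n_col = np.asarray(column_density, dtype=float)
    b = np.asarray(b_kms, dtype=float)
    tau0 = 1.497e-2 * n_col * F_OSC * (LAMBDA_A * 1e-8) / (b * 1e5)
    x = np.linspace(-8.0, 8.0, 801)
    dx = x[1] - x[0]
    absorbed = 1.0 - np.exp(-tau0[:, None] * np.exp(-x**2)[None, :])
    return LAMBDA_A * b / C_KMS * absorbed.sum(axis=1) * dx


def broad_b(rng, n, mean=37.0, sigma=20.0, b_floor=5.0):
    """Truncated Gaussian in b (km/s); the floor is a hypothetical choice."""
    draws = rng.normal(mean, sigma, size=4 * n)
    return draws[draws > b_floor][:n]


def fixed_b(rng, n, value=37.0):
    """Single universal b, for comparison."""
    return np.full(n, value)


def sample_equivalent_widths(n, b_sampler, beta=1.43,
                             n_min=1e13, n_max=1e16, seed=0):
    """Draw N from dn/dN proportional to N^-beta by inverse CDF, then compute W."""
    rng = np.random.default_rng(seed)
    p = 1.0 - beta
    u = rng.random(n)
    n_col = (n_min**p + u * (n_max**p - n_min**p)) ** (1.0 / p)
    return equivalent_width(n_col, b_sampler(rng, n))


if __name__ == "__main__":
    w_broad = sample_equivalent_widths(5000, broad_b)
    w_fixed = sample_equivalent_widths(5000, fixed_b)
    bins = np.arange(0.0, 1.6, 0.1)
    print("broad b:", np.histogram(w_broad, bins=bins)[0])
    print("fixed b:", np.histogram(w_fixed, bins=bins)[0])
```

With a single b the histogram cuts off sharply, because the saturated curve of growth caps W for the assumed range of N. With the broad b distribution the counts extend smoothly to larger W.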

Fig. 5. — Solid histogram: Carswell's (1989) compilation of the distribution of b values 
for Lyman α clouds. Long- and short-dashed curves: gamma law and truncated Gaussian
distributions, each parametrized by a mean and width, which give best fits to the observed 
equivalent width distribution shown in Figure 4. One sees that the behavior of the b 
distribution at small values of b is not well constrained, but that both models, and Carswell's 
data, indicate a tail of large b values extending to > 80 km s^-1.

Fig. 6. — To determine the range of column densities N that are tested by the excellent
agreement of models with data in Figure 4, the mean value N of clouds with a specified 
equivalent width is here plotted against that equivalent width, for four of the b models 
shown in Figure 4. One sees that all the models probe the range N ~ 10^13 cm^-2 to N ~ 10^16
cm^-2.



Fig. 7. — Fluctuations in transmission along the line of sight, here shown as the fractional 
covariance of absorption along a single line of sight measured at two wavelengths separated 
by an integral number of 10 Å bins. Solid dots: observed covariance for the full SSG sample
of quasars. Error bars are determined by bootstrap resampling. Solid curve: prediction 
(with no adjustable parameters) for unclustered clouds with a distribution in z, N, and b 
determined in this paper without reference to fluctuation statistics. The triangular shape 
comes from the instrumental response of an assumed square slit. The good agreement 
between the data and the prediction puts strong limits on a cloud correlation function ξ.
The shoulder in bins 3 through 7, if real, implies a value ξ ~ 0.02 on a comoving scale
~ 10 h^-1 Mpc at redshift z ~ 3.5. However, systematic errors in the determination of the
zero point are also of the same order, so the detection is suggestive only. (See text for details.) 
The open symbols and dotted curves repeat the analysis for low-, medium-, and high-redshift
subsamples of the full sample. There is weak (about 2σ) evidence for the appearance of a
positive correlation function in the lowest-redshift sample (see text). 
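
The bootstrap error bars mentioned in this caption can be illustrated with a minimal sketch. It is not the procedure of the paper. It assumes the transmission is stored as one row per quasar and one column per wavelength bin, that whole quasars are the resampling unit, and that a single global mean is an adequate normalization (the paper's redshift-dependent mean is not modeled). The function names and the synthetic data are hypothetical.

```python
import numpy as np


def fractional_covariance(flux, lag):
    """Covariance of fractional transmission fluctuations at a given bin lag.

    flux has shape (n_quasars, n_bins). Normalizing by one global mean
    is a simplifying assumption of this sketch.
    """
    delta = flux / flux.mean() - 1.0
    if lag == 0:
        return float(np.mean(delta * delta))
    return float(np.mean(delta[:, :-lag] * delta[:, lag:]))


def bootstrap_error(flux, lag, n_boot=500, seed=0):
    """Standard error of the covariance from resampling quasars with replacement."""
    rng = np.random.default_rng(seed)
    n_quasars = flux.shape[0]
    samples = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n_quasars, size=n_quasars)
        samples[i] = fractional_covariance(flux[idx], lag)
    return float(samples.std(ddof=1))


if __name__ == "__main__":
    # Synthetic, uncorrelated transmission values used only to exercise the code.
    rng = np.random.default_rng(1)
    flux = 0.5 + 0.1 * rng.standard_normal((30, 40))
    for lag in range(0, 6):
        print(lag, fractional_covariance(flux, lag), bootstrap_error(flux, lag))
```

For uncorrelated input such as the synthetic array here, the covariance at nonzero lag scatters around zero at a level comparable to its bootstrap error. A persistent positive offset over several lags would be the kind of shoulder the caption describes.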



